DOCUMENT RESUME



ED 11U>11pf 

IR 002 740

AUTHOR        Richards, William D., Jr.
TITLE         A Manual for Network Analysis (Using the NEGOPY
              Network Analysis Program).
INSTITUTION   Stanford Univ., Calif. Inst. for Communication
              Research.
PUB DATE      Jun 75
NOTE          9?p.

EDRS PRICE    MF-$0.76 HC-$4.43 Plus Postage
DESCRIPTORS   Algorithms; Computer Programs; *Information Theory;
              *Intercommunication; Models; *Networks; *Social
              Systems; *Topology
IDENTIFIERS   NEGOPY; *Network Analysis Program

ABSTRACT

Network Analysis is an observational system focusing on the
relationships of individuals within a system: their subgroups, their
leaders, and the frequency of their communications. The strength of
relationships can be measured by examining the incidence and duration
of intercommunication. Using data accumulated by questioning each
member of the organization, the computer employs a prescribed set of
algorithms, the Network Analysis Program (NEGOPY), to yield a
topological interpretation of individuals, subgroups, and
intercommunications. (EKH)



^iiliiiliiiliiilijliliiiliiiliiiliiiliiiliiiliiiliiilifiili^ 

* Documents acguited by ERIC include many informal qnpublished ' * 

* materials not available from other sojurces. EfilC makes every effort 

* to obtain the best copy available. Neve-xtheless, items pf marginal 

* reproducibility are often encountered and this affects the quality , 

* of the microfiche and hardcopy reproductions^'ERIC make3 available 

* via the ERIC Document Reproduction Service (EDRS) . EDRS is not 

* responsible for the quality of the original document. Reproductions.* 

* supplied by EDPS a-re*the >liest fhat can be made ^ from ^the'^origina 1. * 



ERIC 



L 



/ 



f 




A report of the

INSTITUTE FOR COMMUNICATION RESEARCH
STANFORD UNIVERSITY


A MANUAL FOR NETWORK ANALYSIS
(USING THE NEGOPY NETWORK ANALYSIS PROGRAM)


William D. Richards, Jr.
Institute for Communication Research
Stanford University 

June 1975 






(c) Copyright 1975, William D. Richards, Jr. No part of this document,
with the exception of short quotations, may be copied or reproduced in
any form without the prior written consent of the author.



TABLE OF CONTENTS

Part one - The goals of Network Analysis

Part two - The data for Network Analysis
     Relationships
     Ratio level scales
     Combining two or more indicators
     Other instrumentation considerations
     Directionality/symmetry
     The actual collection of data

Part three - The analysis
     I.  The algorithm which defines groups
          Drawing the tentative boundaries
          Using the criteria for an exact solution
     II. NEGOPY: the Network Analysis Program
          1. Data description
               a. review
               b. parameters
               c. output
          2. Preparation for group detection
               a. review
               b. parameters
               c. output
          3. Initial group detection
               a. review
               b. parameters
               c. output
          4. Application of the formal criteria
               a. review
               b. parameters
               c. output
          5. Final results and control of output
               a. review
               b. parameters
               c. output

Part four - Using the Network Analysis Program
     I.  Setting up a Network Analysis run
          System control cards
          NEGOPY control cards
          Data cards and data format cards
          Namelist cards and namelist format cards
     II. Miscellaneous aspects of running the program
          Error messages or warnings
          Adjusting the parameters to get better results
          Known bugs
     Parameter List



PART ONE

THE GOALS OF NETWORK ANALYSIS

Without communication, there would be no social organizations. There
would be no corporations, no hospitals, no universities, no societies.
The process of communication allows people to work together for some
common goal. It allows people to coordinate their behaviors and to share
their feelings. Communication has been described as the "thread" that
holds organizations together, as the "glue" that bonds people together in
relationships, and as the force that allows groups of people to take on
their own identity. Clearly, the process of communication is
fundamentally important to any activity that requires more than a single
person.

As the number of people working together toward a common goal
increases, and as the complexity of that goal increases, so does the
importance of communication: the more people there are, the more
important it becomes to keep everyone informed and to coordinate their
behaviors efficiently. The more complex the task of the organization,
the more important it is that everyone know their job and the more
important efficient information flow becomes.

Network Analysis is a way of studying the communication networks
that develop in social systems as people communicate with each other.
Specifically, it is a way of examining the whole set of relationships that
exists in a functioning, ongoing system. Network Analysis allows us to
make statements about intact systems because it takes a systems approach
to the analysis situation: it focuses on the relationships between the
people in the system and looks at all the relationships at once, without
isolating the people from each other or from the relationships between
them. With Network Analysis we can not only see how the system as
a whole is structured but we can also see how each individual person fits
in with the larger structure. This becomes crucial when very large systems
are involved, as people may come to play specialized roles in the
communication networks of these systems, and as poorly organized networks
may lead to very serious problems for the organization as a whole.

Network Analysis allows us to study the system as it is, rather than
as someone thinks it ought to be. This is because it uses data collected
from people in the system which describe the way they fit into the
system as the system normally functions, rather than relying on
organizational charts which tell how people ought to behave or how
management "thinks" people behave. Thus, the information provided by
Network Analysis is a more valid description of how the system actually
operates than information taken from such charts.

With Network Analysis, we can study the structure of the system.
Large systems are almost always differentiated into smaller parts. These
smaller parts may be groups of individuals who work together on a common
task or they may be groups of groups of individuals. Network Analysis
identifies these groups and shows us how they are connected, either by
direct links between members of the different groups or by links that go
through specialized "linkers" (people who function as "go-betweens" or
"liaisons" to connect the groups). Network Analysis also gives detailed
descriptions of the communication flows within the groups as well as flows
between groups. For example, we can see if some people in the group are
more central or more critical, in terms of information flows within the
group, than others in the group.

Since Network Analysis is based on networks, that is, sets of nodes
(people) with links (communication relationships) between them, it gives
us topological information. This kind of information may be contrasted
with distance information, which is provided by multidimensional scaling
methods, or with variance information, which is provided by correlational
methods. Since we are interested in topological properties of networks
(who talks to whom, and so on), we should use a method that takes this
approach. To be sure, we may later be interested in other kinds of
information, and then we would use other techniques. The Network Analysis
method described here is primarily topological.

Many other researchers have attempted to do Network Analysis in the
past. These investigators have used a variety of analytic techniques,
including sociograms, matrix multiplication or manipulation methods, and
factor analysis. None of these methods is ideal: all are slow and
cumbersome, some do not work at all, and the others do not work well. The
common failure of these other approaches can be traced, among other things,
to a lack of clear definitions and unambiguous goals. Indeed, this is the
point at which we begin: with a clear set of goals and formal definitions.
We use concepts that appear to be very similar to the concepts used by the
early sociologists: groups, liaisons, and so on. However, our definitions
are explicit and clear, and this will allow us to do much more than
would be possible otherwise. The definitions we use are as follows.

We begin with the smallest units of analysis: nodes and links. In
the case where we are examining a communication network in an organization,
nodes would be people. Whenever a person reports a relationship with
another person, there would be a link between the corresponding pair of
nodes.



We then divide the nodes into two types: participants and
non-participants. Participants are the nodes that take part in the
exchange and transfer of information with other participants. The
non-participants include all the nodes having either no connection or only
minimal connection to participants. There are four kinds of
non-participants (all the roles are illustrated in Figure 1).

Isolate Type 1        These nodes have no links whatsoever.

Isolate Type 2 or     These nodes have only a single link and thus
Attached Isolate      cannot take part in the transfer of information
                      through the network. They may, however, function
                      as sources of information if they have links
                      outside the system.

Isolated Dyad         These nodes are similar to attached isolates:
                      they are pairs who are linked to each other. In
                      terms of contact with the rest of the network,
                      they function more like Isolates Type 1.

Tree Node             When there is a subset of nodes with minimal
                      connections (i.e., the number of links equals
                      n-1, where n is the number of nodes in the
                      subset), the subset will be a tree structure
                      composed of isolates and tree nodes. The removal
                      of any single link separates the tree structure
                      into two isolated parts. The isolates will be the
                      nodes at the ends of the structure, and will have
                      only single links. The tree nodes are the other
                      nodes in the structure.

Participants are nodes that have two or more links to other participant
nodes. They make up the bulk of the network in most cases and allow
for the development of structure. They include:

A.  Group member. A node with more than some percentage of its
    linkage with other members of the same group. (This percentage is
    called the alpha-percent or α-percent.)

B.  Liaison. These nodes fail to meet the α-percent criterion with
    members of any single group but do meet it for members of groups
    in general.

C.  Type other. These nodes fail to meet the α-percent criterion
    for any set of group members.
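The non-participant roles and the two-link rule for participants can be sketched in code. The Python fragment below is only an illustration under stated assumptions, not NEGOPY's own procedure: it assumes symmetric links given as pairs of node names, and it assumes that tree nodes can be found by repeatedly setting aside any node with fewer than two links to the nodes still under consideration. The function name `classify_roles` and the role labels are hypothetical.

```python
def classify_roles(nodes, links):
    """Label each node as a non-participant type or as a participant.

    nodes: iterable of node names
    links: iterable of (a, b) pairs, treated as symmetric links
    """
    neighbors = {n: set() for n in nodes}
    for a, b in links:
        if a != b:  # ignore self-links
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)

    roles = {}
    for n, linked in neighbors.items():
        if len(linked) == 0:
            roles[n] = "isolate type 1"
        elif len(linked) == 1:
            (other,) = linked
            # A pair linked only to each other is an isolated dyad.
            if len(neighbors[other]) == 1:
                roles[n] = "isolated dyad"
            else:
                roles[n] = "isolate type 2"

    # Assumption: repeatedly set aside nodes with fewer than two links
    # to nodes still under consideration. Nodes removed this way that
    # were not already labeled as isolates are taken to be tree nodes.
    active = set(neighbors)
    changed = True
    while changed:
        changed = False
        for n in list(active):
            if len(neighbors[n] & active) < 2:
                active.discard(n)
                roles.setdefault(n, "tree node")
                changed = True

    for n in active:
        roles[n] = "participant"
    return roles
```

For a triangle A-B-C with a chain C-D-E hanging from it, a separate pair F-G, and an unlinked node H, this sketch labels A, B, and C as participants, D as a tree node, E as an isolate type 2, F and G as an isolated dyad, and H as an isolate type 1.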



To be called a group, a set of nodes must satisfy these five criteria.

A.  There must be at least three members.

B.  Each must meet the α-percent criterion with the other members
    of this group.

C.  There must be some path, lying entirely within the group, from
    each member to each other member. (This is called the
    connectiveness criterion.)

D.  There may be no single node (or arbitrarily small set of nodes)
    which, when removed from the group, causes the rest of the group
    to fail to meet any of the above criteria. (This is called the
    critical node criterion.)

E.  There must be no single link (or subset of links) which, if cut,
    causes the group to fail to meet any of the above criteria.
    (This is called the critical link criterion.)
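Criteria C, D, and E lend themselves to a small worked sketch. The Python fragment below is a simplified illustration and not the program's algorithm: it tests only connectedness within the group, and it considers the removal of one node or one link at a time rather than small sets of them. All function names are hypothetical.

```python
from collections import deque


def is_connected(members, links):
    """True if every member can reach every other member using only
    links that lie entirely within the group (criterion C)."""
    members = set(members)
    if not members:
        return False
    neighbors = {m: set() for m in members}
    for a, b in links:
        if a in members and b in members and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in neighbors[node] - seen:
            seen.add(other)
            queue.append(other)
    return seen == members


def critical_nodes(members, links):
    """Members whose removal disconnects the rest (criterion D,
    restricted here to single nodes and to connectedness)."""
    members = set(members)
    return {m for m in members if not is_connected(members - {m}, links)}


def critical_links(members, links):
    """Links whose removal disconnects the group (criterion E,
    restricted here to single links and to connectedness)."""
    links = [tuple(link) for link in links]
    return {
        link
        for i, link in enumerate(links)
        if not is_connected(members, links[:i] + links[i + 1:])
    }
```

A four-node ring has no critical node and no critical link under this test, while a triangle with a fourth node hanging from one corner has both.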



The following points are relevant here. First, the α-percent that appears
above is usually set to something greater than 50%, such as 51% or 50.01%.
This is done to prevent ambiguous situations which could occur otherwise.
Second, several of the criteria refer to a proportion of a node's linkage.
We are referring here to amount of linkage rather than number of links.
If amount is operationalized as "time spent interacting," we would look
at the appropriate sets of links in terms of what fraction of the total
amount of time they comprise, rather than in terms of how many links
there are in total. Thus, a node having ten links could be a group member
even if only two of those ten links (20%) were with members of the group,
as long as those two links account for more than the α-percent of the
total linkage. For example, those two links might take four hours per
week. If the eight other links combined total less than four hours per
week, the two links make up more than half of the total, and with an
α-percent just above 50% the node would be a member of the group.
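The distinction between amount of linkage and number of links can be made concrete with a short sketch. The Python fragment below is an illustration only. It assumes that group memberships are already known and that each link carries an amount in minutes per week. The function names and the default α of 0.51 are assumptions chosen to match the example in the text.

```python
def linkage_share(link_amounts, target_nodes):
    """Fraction of a node's total linkage (by amount, not by count)
    that goes to the nodes in target_nodes."""
    total = sum(link_amounts.values())
    if total == 0:
        return 0.0
    inside = sum(amount for other, amount in link_amounts.items()
                 if other in target_nodes)
    return inside / total


def participant_role(link_amounts, groups, alpha=0.51):
    """Apply the alpha-percent criterion to one participant.

    link_amounts: {neighbor: amount of linkage}
    groups: {group name: set of member names}
    Returns (role, group name or None).
    """
    for name, members in groups.items():
        if linkage_share(link_amounts, members) > alpha:
            return "group member", name
    all_group_members = set().union(*groups.values()) if groups else set()
    if linkage_share(link_amounts, all_group_members) > alpha:
        return "liaison", None
    return "type other", None
```

With two links of 120 minutes each to members of one group and eight other links of 25 minutes each, the share of linkage with the group is 240/440, or about 55%, so the node meets a 51% criterion even though only 20% of its links are with the group.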

The goals of Network Analysis are to classify the nodes in the
network into the various roles, based on their patterns of interaction with
each other, and to provide as much information as possible about the system
at each of three levels: the individual, the group, and the whole system.



PART TWO

THE DATA FOR NETWORK ANALYSIS

We said earlier that this approach is different from others because
it focuses on the relationships between individuals, rather than on the
individuals themselves. Indeed, it is this focus on the relationships
that allows us to make statements about the system as a whole rather than
merely about the people in the system. Because the relationships play
such a central role in the conceptual formulation, the way we handle them
deserves a very careful examination. This is the area we cover in this
section.

Relationships

We might begin by asking which aspects of relationships are important
to us. There seems to be an almost infinite variety of kinds of
relationships between persons in organizations or systems. If we are to
make any progress at all, we must reduce this infinite variety to a
smaller, more useful set of dimensions. Of course, this has been done. For
our purposes, we only need to look at a few aspects of relationships
between persons. One is the strength of the relationship. How much does it
matter? How often is the relationship "activated" or "used"? Another is
symmetricality: if I am related to you, are you related to me in the same
way? Another is transitivity: if I am related to you and you are related
to Harry, does that mean I am related to Harry?



If we are working with communication networks, the relevance of these
questions becomes clear. The strength of the relationship might refer to
how often we talk, or to how much information we exchange. The
relationship might be symmetrical, where we share ideas and exchange
information, where the influence is bi-directional; or asymmetrical, where
I would give you information or you would give me orders, where the
influence goes in one direction only. Similarly, if I pass on information
I receive from you to others, the relationship would be transitive, while
if I do not, if I keep it to myself, the relationship would be
intransitive.

Surely there are other factors that are important. Perhaps the most
important, at least in the case of a communication relationship, is the
content or function of the relationship. In this way we might distinguish
between formal, job-related communication and informal communication about
matters not related to the job, or between communication about new ideas
and communication about errors or problems that are encountered in
day-to-day work.

These are the factors that are usually considered to be important by
people who do a lot of network analysis. But these are all conceptual
factors. They all refer to the relationship between the people in the
system. These conceptual issues must be translated into operational
procedures so that we can build a model of the system in the form of data.
This translation is accomplished by our operationalization of concepts
and by our measurement procedures. Here we will create artificial
constructs which we will use to represent the real world.

For example, in Network Analysis we refer to people as nodes. The
node is our representation of the person. Similarly, we use links to
represent relationships. The link is not the relationship; it indicates
that there is a relationship between the people corresponding to the nodes
it connects. This may seem to be a useless philosophical complication,
but it is important to keep the distinction clear. Let us see why.

If we conceptualize the relation as symmetrical, then whenever A is
related to B, B should also be related to A. This means, for example,
that if we are treating the relationship "talks to" as symmetrical and
Harry says he talks to Joe, Joe should also say he talks to Harry. But this
might not be the case. If Joe does not say he talks to Harry, one of two
things might be happening. First, the relationship might not be
symmetrical. Maybe Joe really does not talk to Harry. Maybe Harry talks at
Joe. Second, perhaps Joe simply forgot that he talked to Harry. Maybe he
just made a mistake and the conceptualization of the relation as
symmetrical was accurate.

Thus, we can use the correspondence between relationships and links
to verify our conceptualization and to check on our measurement techniques.
If a relationship is conceptualized as symmetric, all links should be
reciprocated. If this does not happen, we have to decide either to alter
our conceptualization or to place less confidence in our measurement
technique. If only a few links are not reciprocated, we might assume the
problem is measurement error and either delete unreciprocated links or
add the "missing halves." If a large percentage of our links are
unreciprocated, we should consider the possibility that either the
relationship is really not symmetrical or that there is a serious
measurement problem which is biasing our results in a direction that makes
things look as if they do not fit our conceptualization. In other words,
there may be a symmetrical relationship but our measurement process may be
getting at a separate, asymmetrical relationship.
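The reciprocation check described above can be sketched as follows. This Python fragment is a minimal illustration, assuming each report is a (respondent, named person) pair. The function names are hypothetical, and the choice between deleting unreciprocated links and adding the missing halves is left to the analyst, as in the text.

```python
def reciprocity_report(reports):
    """Return the unreciprocated directed links and their share of
    all reported links."""
    directed = {(a, b) for a, b in reports if a != b}
    unreciprocated = {(a, b) for (a, b) in directed
                      if (b, a) not in directed}
    share = len(unreciprocated) / len(directed) if directed else 0.0
    return unreciprocated, share


def drop_unreciprocated(reports):
    """Keep only links that both people reported."""
    directed = {(a, b) for a, b in reports if a != b}
    return {(a, b) for (a, b) in directed if (b, a) in directed}


def add_missing_halves(reports):
    """Treat every reported link as if both people had reported it."""
    directed = {(a, b) for a, b in reports if a != b}
    return directed | {(b, a) for (a, b) in directed}
```

If the share returned by `reciprocity_report` is small, either repair may be reasonable. If it is large, the conceptualization or the instrument deserves a second look before any repair is applied.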

A second set of considerations is related to the strength of the
relationship. In order to use the conceptual system we discussed earlier
as a classification scheme for nodes, we have to have some indicator of
the strength of the relationships between them. When we are interested
in communication networks, a logical indicator of strength is "the amount
of information exchanged or passed on from one node to the other." Now,
this is a difficult quantity to measure. We might assume, for the sake
of simplicity, that the amount of information flowing is proportional to
the length of time spent communicating, perhaps, or to the frequency of
interaction. Our actual measurement procedure would then tap the duration
or frequency of interaction. In this case, we would ask the people in
the organization to indicate who they talk with and either how often or
for how long.



Ratio Level Scales

Not only does this indicator of the strength of the relationship have
to be a single number but it must also be a ratio-level indicator. That
is, it must vary approximately as a ratio of the strength of the
relationship. This is easily accomplished with the appropriate choice of
coding systems. If we ask people how much time in minutes they spend
talking to the people they talk with, we will have a ratio-level
indicator. If we provide categories like:

     Once a month or less
     Once or twice a week
     Once or twice a day
     Several times a day

we will have to assign numbers to these categories in such a way that a
ratio-level approximation is achieved. If we translate the categories
into number of interactions per month, we might get:

     CODING

     Once a month or less  =  1
     Once or twice a week  =  8
     Once or twice a day   = 27
     Several times a day   = 64

If we code a "once a month" response as "1", a "once or twice a week" as
"8", and so on, we will have an approximation to a ratio scale. Rather
than using the numbers "1, 8, 27, 64," we might use their cube roots
(which happen to be 1, 2, 3, 4) and restore the original values at the
time of analysis.
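The coding trick just described is easy to express in code. The Python fragment below is a small sketch using the category values from the text. The dictionary and function names are hypothetical.

```python
# Approximate interactions per month for each response category,
# taken from the coding table in the text.
INTERACTIONS_PER_MONTH = {
    "once a month or less": 1,
    "once or twice a week": 8,
    "once or twice a day": 27,
    "several times a day": 64,
}


def compact_code(category):
    """Code recorded on the instrument: the cube root of the
    ratio-level value (1, 2, 3, or 4)."""
    return round(INTERACTIONS_PER_MONTH[category] ** (1 / 3))


def restore_value(code):
    """Recover the ratio-level value at the time of analysis."""
    return code ** 3
```

The respondent sees only the compact codes 1 through 4. Cubing them at analysis time restores the approximate ratio scale.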



Combining Two or More Indicators

We may decide that simple frequency or duration data are not good
enough. For example, a very important exchange might be very infrequent
or short, and we might want this to balance with exchanges that are
individually less important but occur much more often. In this case, we
might ask our respondents to indicate how important the information
exchanged was, in addition to how much or how often. We would then combine
the two scales into a single indicator of the strength of the relationship.
Let us take an example to show how this is done. Say we ask both frequency
and importance questions, as shown below.

     Please indicate by circling the appropriate numbers
     which people you talk to, how often you talk to
     them, and how important the interaction usually is.
     Use the coding system shown here.

     FREQUENCY                     IMPORTANCE

     1 = once/month                1 = slightly important
     2 = once/week                 2 = moderately important
     3 = once/day                  3 = very important
     4 = several times/day         4 = crucial to survival

     NAME              FREQUENCY        IMPORTANCE

     John Jones        1  2  3  4       1  2  3  4
     Emily Stuart      1  2  3  4       1  2  3  4
     Tony Mann         1  2  3  4       1  2  3  4
     Belinda Humm      1  2  3  4       1  2  3  4
     Mark Smith        1  2  3  4       1  2  3  4



We would form a matrix where the rows are for the different values
of frequency and the columns are for the different values of importance,
as shown here.

                            IMPORTANCE
                       1     2     3     4
     Several/day
     Once/day
     Once/week
     Once/month

We would then decide which entries should have the highest and lowest
values. Obviously, these would be the top right entry and the bottom left
one in the example.

The next step is to assign the intermediate values. This is more
difficult. For example, how does the top left entry compare with the
bottom right one? What about other entries? If the values shown below are
acceptable, the two scales can simply be multiplied together to give the
final results. If importance had been coded in the reverse order, as shown
below, the values for that scale would have to be reversed. In this case,
they would be subtracted from five to give the results shown here.

                            IMPORTANCE
                       1     2     3     4
     Several/day       4     8    12    16
     Once/day          3     6     9    12
     Once/week         2     4     6     8
     Once/month        1     2     3     4

     The Matrix With Values Filled In

     ORIGINAL                        REVERSED
     CODE         IMPORTANCE         CODE

        1         Crucial               4
        2         Very                  3
        3         Moderately            2
        4         Slightly              1

     Reversing Scales to Obtain the Correct Order
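The multiplication and the reversal can be sketched in a few lines. The Python fragment below is an illustration that assumes both scales run from 1 to 4, as in the example. The function names are hypothetical.

```python
SCALE_MAX = 4  # both example scales run from 1 to 4


def reverse_code(code, scale_max=SCALE_MAX):
    """Reverse a scale coded in the wrong order: with a 1-4 scale
    this subtracts the code from five."""
    return scale_max + 1 - code


def product_strength(frequency, importance):
    """Single strength indicator formed by multiplying the scales."""
    return frequency * importance


def strength_matrix(scale_max=SCALE_MAX):
    """Rows run from the highest frequency code down to the lowest;
    columns run from the lowest importance code up to the highest."""
    return [
        [product_strength(f, i) for i in range(1, scale_max + 1)]
        for f in range(scale_max, 0, -1)
    ]
```

The top right entry of `strength_matrix()` is the largest value (16) and the bottom left is the smallest (1), which matches the ordering the text asks for.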



In the example discussed above, we formed the strength indicator by
taking the product of the two original scales. In other cases, we
would use a linear combination instead. For example, say we had separate
scales for face-to-face and telephone interactions, as shown here.

     Please indicate how much time you
     spend talking to each person in
     an average week (in minutes).

     NAME          FACE-TO-FACE      TELEPHONE

     Robert        ____________      _________
     James         ____________      _________
     Annie         ____________      _________
     Frank         ____________      _________
     Susan         ____________      _________

We might decide that face-to-face interactions are twice as important as
telephone interactions because of the additional non-verbal information
that is transmitted in face-to-face interactions. Then we would use this
formula for calculating the final strength indicator:

     Strength = 2*Face-to-Face + Telephone
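As a brief sketch, the formula above can be written as a small Python function. The weights default to the 2-to-1 ratio chosen in the text. The function and parameter names are hypothetical, and other weights could be substituted if a different judgment about relative importance were made.

```python
def linear_strength(face_to_face_minutes, telephone_minutes,
                    face_weight=2.0, telephone_weight=1.0):
    """Strength = 2*Face-to-Face + Telephone, with adjustable weights."""
    return (face_weight * face_to_face_minutes
            + telephone_weight * telephone_minutes)
```

A respondent reporting 30 minutes face-to-face and 20 minutes by telephone with the same person would receive a strength of 2*30 + 20 = 80 for that link.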




The important point here is that a single ratio-level indicator (or
an approximation to one) must be available as an index of the strength
of the relationship. A lot of trouble can be saved by constructing
instruments so that they can be easily coded to give ratio-level data. If
this is not done, the data must be transformed to give ratio data at the
time of analysis, if that is possible.



Other Instrumentation Considerations

In the discussions above we have seen several examples of instruments
that might be used to collect network data. They are all variations of
the same basic design. Some types seem to work better than others in
different situations. For example, there are two ways of getting the
respondent to provide the names of the people he or she is linked to.
The first works well when there are fewer than about two or three hundred
people in the organization. With this method, a list of all the people
is provided and the respondent simply fills in the appropriate spots on
the instrument. An example of this type is shown in A below.

A.   How often do you interact with the people named here? Please
     indicate the approximate number of interactions per week for
     both job-related conversations and other conversations.

     NAME          JOB-RELATED       OTHER

     Sam           ___________       _____
     Mary          ___________       _____
     Bill          ___________       _____

B.   In the column on the left, please write the names of people
     you talk to. In the other columns please indicate how many
     times you talk to these people in a typical week. Do this for
     both job-related conversations and other conversations.

     NAME          JOB-RELATED       OTHER

     ________      ___________       _____
     ________      ___________       _____
     ________      ___________       _____

In the first type, the respondent only has to recognize the name of
the person he or she is linked to. In the second type, as shown in B, the
respondent is asked to recall the names. The second type is appropriate
for very large organizations, where it would be impractical to provide a
list of all the names because of the length of such a list, or for systems
where all the names of relevant people are not known.

There is likely to be a difference in the number of contacts reported
on the two types of instruments. Specifically, since it is easier to
recognize a name on a list than to recall a name from memory, because the
list of names serves as a prompter, there will probably be more contacts
reported with the first method than with the second. Unfortunately, this
difference has not been tested empirically, so no definite statements can
be made regarding the trade-off.

A second way in which instruments may vary is in the method of coding
the strength of interactions. A variety of approaches has been used here:
(a) interaction frequency may be coded into categories, as shown in "A"
below; (b) interaction frequency may be coded directly, as shown in "B";
(c) interaction duration may be coded into categories, as shown in "C";
(d) interaction duration may be coded directly, as shown in "D".

     FREQUENCY

     A.   1. Once/month                B.   How many times in
          2. Once/week                      the last week?
          3. Once/day
          4. Several/day

     DURATION

     C.   1. Less than 5 mins.         D.   How much time in
          2. Less than 10 mins.             the last week?
          3. Less than 20 mins.
          4. Less than 30 mins.
          5. More than 30 mins.

From a theoretical perspective, it would seem that the method shown
in "D" above would provide the most valid information. However, it is
harder to estimate durations of interactions than frequencies of
interaction, as in "A" and "B", and it is harder to estimate precise
numbers than simple categories, as in "A" and "C". Thus, the method shown
in "A" is probably the easiest for subjects to use, while the one in "D"
provides the best information. Again, there have been no empirical studies
comparing the alternative methods.

When several content areas are to be used at once, it is not
necessary to have a separate instrument for each one. Instead, they can be
combined into a single form, with multiple columns for the different
content areas. An example of this is shown below, where three separate
content areas are being measured at once. In analysis, these will be
treated as three separate networks which might later be compared and
examined for similarities or differences.



     Please indicate how often you talk to the following
     people about each of the three topic areas. Use this
     system for coding your responses.

          1. Once a month
          2. Once a week
          3. Once a day
          4. Several times a day

                  PRODUCTION:       INNOVATION:      SOCIAL RELATIONS:
                  GETTING MY        NEW IDEAS OR     INFORMAL FRIENDSHIP
     NAME         JOB DONE,         WAYS OF DOING    CONVERSATIONS, ETC.
                  DAY-TO-DAY        THINGS
                  MATTERS

     Harry        __________        __________       __________
     Timothy      __________        __________       __________
     Maude        __________        __________       __________
     Jenny        __________        __________       __________
     Donald       __________        __________       __________
     Michael      __________        __________       __________
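Treating the columns of such a form as separate networks can be sketched as follows. The Python fragment below is an illustration only. It assumes each completed row is recorded as a respondent, a named person, and a dictionary of codes keyed by content area, with blank entries stored as 0 or None. The function name and the record layout are hypothetical.

```python
def split_by_content_area(responses):
    """Build one network per content area from a combined form.

    responses: iterable of (respondent, named_person, codes), where
    codes maps a content-area name to the circled code.
    Returns {content area: {(respondent, named_person): code}}.
    """
    networks = {}
    for respondent, named_person, codes in responses:
        for area, code in codes.items():
            if code:  # skip blank (0 or None) entries
                links = networks.setdefault(area, {})
                links[(respondent, named_person)] = code
    return networks
```

Each resulting network can then be analyzed on its own and later compared with the others, for instance by checking which pairs are linked in one content area but not in another.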
Directionality/Symmetry

We mentioned symmetricality as one dimension along which
relationships may vary. In order to be consistent, our instruments should
reflect our conceptualization of the relationship we are interested in.
In this case, we should ask a question that elicits non-directional
responses if we have a symmetrical relationship, and so on. This is
especially important when we are interested in asymmetrical relationships,
for which we would expect the direction of the relationship to be
important. For example, if we are interested in information flows,
specifically in the direction of flow, we might use a question like:
"Please indicate which of the following people you received information
from in the last week" or "Please indicate how often each of the following
people come to you for information." On the other hand, when we do not
want directional information, we should be careful not to use instruments
which elicit this information, since the result will be a distorted
version of the network we are really interested in.



The 'Actual Collection of Data, 



Compared to face-to-face interviewing methods, the collection of
network data is relatively fast and easy. Respondents can be assembled
in large groups by the team of investigators, who then explain the
nature of the study, ensure confidentiality, and discuss the
instrument. Once the instrument is distributed, there will be questions
about the meaning of content categories, what to do if a person's name
does not appear on the list, and so on. At any rate, the time required
to fill in the instrument will seldom be more than an hour.

There are several very important points to be made here. First, it is
absolutely essential that respondents indicate who they are. This can
be easily done by asking them to circle their own name on the list or
to write it at the top of the page. If this is not done, the data are
useless.

Second, for a valid Network Analysis to be done, the entire system must
be censused. Every person in the system should fill out an instrument.
A failure rate of more than five to ten percent will greatly reduce the
validity of this technique. Thus, if some respondents miss the data
collection session or neglect to turn in the completed instruments, it
is necessary to have a member of the investigating team locate those
individuals and obtain the data if possible. If this is not possible,
a list of missing persons should be compiled and used in the
interpretation of the final results. If the system as a whole is too
large but may be broken down into smaller divisions or subsystems, a
sample of these subsystems may be analyzed, where there is complete
censusing within each intact subsystem. This situation is due to the
nature of Network Analysis: the unit of analysis is the system, rather
than the individual.

Third, respondents should be encouraged to indicate how they really
behave, rather than how they think they ought to behave. It helps if
the study is introduced as a diagnostic aid "to see that people get the
information they need to do their jobs," and if the confidentiality of
the study is emphasized: "No one from the Company will see these forms.
We (the analysis team) will take them back with us to our university,
where we personally will do the analysis."

Fourth, respondents must be given unique subject numbers running from
1 to N, where N is the number of respondents in the system. If these
numbers are printed on the data collection instruments next to the
names of respondents, coding and punching are greatly facilitated. This
is not possible when open-ended instruments (where the respondent must
recall the names of individuals he or she talks with) are used. In
these situations, the respondent numbers must be obtained from a list
of names and added at the time of coding. There are serious
difficulties with this method, as people cannot recall full names of
individuals they talk with, or as they write names incorrectly or
illegibly. For these reasons, the other format for instruments is
highly recommended.



PART THREE

THE ANALYSIS


Introduction

So far we have seen what Network Analysis does and what kind of data
it uses. In this section, we will see how the actual analysis is
accomplished. The main tool we have is Negopy, the Network Analysis
Program.* In the first part of this section, we will discuss the
algorithm upon which the program is based, because an understanding of
the kinds of things the program does is useful to any potential user of
the program. In the second part, we will take up several considerations
directly related to the 1975 CDC version of the program, such as limits
on the data, specific requirements, and so on. We will also discuss the
various options the user has when running the program and the output of
the program: what the various tables mean and how to interpret them.
The third part contains detailed information on using the program: how
to prepare control cards, and so on. In Part Four, several
miscellaneous issues are discussed. These include error messages and
how to interpret and correct them, how to "fine tune" the program, and
a section on known "bugs" in the program.

The point was made earlier that Network Analysis is a topological
method: it looks for specific patterns in the data. The realization
that this is a pattern recognition problem made it possible to program
a computer to do the analysis. The Network Analysis Program, then, is
based on a pattern recognition algorithm. Although it uses a variety of
statistical and mathematical operations as it carries out an analysis,
it is not based on mathematical or statistical procedures, as are other
kinds of analytic programs.

* © Copyright 1975, WDR

There are five stages in the analysis. In the first stage, the data
are read in, cleaned, and organized in an orderly fashion. Then an
iterative operation that makes the actual pattern recognition part
possible is performed in the second stage. In the third stage, the
pattern recognition algorithm is carried out. Here, groups are
tentatively identified. In the fourth stage the strict criteria for the
various role definitions are applied and the tentative solution
produced earlier is tested and made exact. The results of the analysis
are printed out in the form of various tables and charts in the fifth
and final stage. In the present part of the discussion, we will only
cover the second, third, and fourth stages of the analysis: the parts
where groups are identified and the formal criteria are applied.

I. THE ALGORITHM WHICH IDENTIFIES GROUPS

The major task to be accomplished in this part of the analysis is to
identify the groups. We have data describing the relationships between
individuals. If we can represent the data in the right way, it will be
easy to see the groups. The representation we would like would be one
in which the members of each group are close to other members of the
same group and far from the members of other groups. Then we could just
look for clusters: groups, or sets of nodes having most of their
linkage to other nodes in the same groups. This will be a graphical
representation of the data: nodes will be moved around until their
locations, relative to other nodes, can be used to decide the way they
fit into the network.

The first step is thus to rearrange the data so that the groups become
visible, and the second to identify the groups. The way the first step
is accomplished can be understood easily with the following analogy.
Imagine the nodes to be like billiard balls scattered about in space.
Imagine there to be rubber bands connecting the balls corresponding to
nodes with links between them. Imagine there to be springs between
balls corresponding to nodes that do not have links between them. The
rubber bands will act to pull the balls connected to each other closer
to each other, while the springs will push the balls not connected to
each other apart from each other. If we hook up the rubber bands and
springs and release the balls, they will rearrange themselves so that
the balls corresponding to nodes with links to each other will be close
to each other, while the balls corresponding to nodes that are not
linked to each other will be pushed away from each other. This example
is shown in Figure 3.

We could refine this technique by using heavier rubber bands to
represent the links that occur more often or are more important. Since
our objective here is to make it easier to identify groups, we could
make the process work even better if we could make the rubber bands for
within-group links heavier than the ones for other kinds of links. In
order to do this, we will need some indicator that tells us which links
look like within-group links.

If two nodes are in the same group, they are likely to have many links
to the same people. There is likely to be a high number of shared
links, or two-step links, between this pair of nodes. If they are not
in the same group, they are not likely to talk to the same people, and
there are not likely to be many two-step links between the nodes. Thus,
the number of two-step links is used as an indicator of the probability
that the link is a within-group link.

FIGURE 3

This figure illustrates the billiard ball and rubber band model
described in the text. The network shown has two groups of three nodes
each. The three drawings represent three successive time increments as
the nodes move farther and farther in response to the forces exerted by
the rubber bands.

The original position of the balls is shown by the shaded circles in
the top drawing. Movement of balls during each time increment is shown
by the dotted arrows in the three drawings. The scale was changed in
going from the first to the second to the third drawing in order to
show smaller and smaller regions in space as occupying the same sized
area in the drawings. The region of the top drawing shown in the middle
one is indicated by the dotted box in the top. Similarly, the area of
the bottom drawing is shown by the dotted box in the middle one.
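
As a rough illustration of the two-step indicator, the sketch below
counts, for each pair of nodes, the number of other nodes to which both
are linked. It is written in Python rather than in the language of the
1975 program, the function name two_step_counts is hypothetical, and
links are treated as undirected. How Negopy itself converts these counts
into a weighting factor is not specified here.

```python
from itertools import combinations


def two_step_counts(links):
    """Count shared contacts (two-step links) for every pair of nodes.

    links is an iterable of (a, b) pairs. Treating every link as
    undirected is an assumption of this sketch.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    counts = {}
    for a, b in combinations(sorted(neighbors), 2):
        # Nodes that both a and b are linked to.
        counts[(a, b)] = len(neighbors[a] & neighbors[b])
    return counts


# Two groups of three nodes (1-2-3 and 4-5-6) joined by one link, 3-4.
links = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
counts = two_step_counts(links)
print(counts[(1, 2)])  # a within-group pair
print(counts[(3, 4)])  # the pair joined only by the bridge link
```

In this small example the within-group pair (1, 2) shares one contact,
while the bridge pair (3, 4) shares none, so the bridge link would
receive the lighter "rubber band."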

Now, it is hard to represent large numbers of points in
multi-dimensional space. It takes a lot of information to do this, and
it is fairly difficult to move objects in this kind of a space.
Extensive experimentation with real data, however, showed that it was
not necessary to use a multi-dimensional representation for this
analysis; a single line segment was sufficient. This kind of reduction
in complexity of representation greatly reduced the amount of
information needed to perform the analysis at the same time it made the
analysis itself easier to do.

The analysis is performed as follows: nodes are scattered at unit
points along a line segment N units long, where N is the number of
nodes. We then treat each link from, say, node A to node B, as a
vector, starting at A and pointing at B. We take all the vectors for
each person and compute the average, weighting the individual vectors
for strength of the link and probability that the link is a
within-group link. We then get a single point for each individual, that
point being the mean of that person's vectors. This is illustrated in
Figure 4. After all the means have been computed, each node is moved to
the point indicated by his or her mean.

After this process has been completed, nodes with links to each other
will be closer to each other than they were before. They will not,
however, be as close as they could be. This fact is due to the way
nodes are scattered initially, and also because of the statistical
properties of the mean. For this reason, the entire process is
repeated, using the new locations instead of the original positions
used for the first set of calculations. A plot showing how the nodes
moved in successive iterations is shown in the bottom half of Figure 4.
Between each set of calculations it is necessary to expand the scale of
the continuum so that the spread or range which is occupied by the
nodes remains N units long. If this is not done, the points will move
closer and closer to each other, finally collapsing on a single spot.
This is the "scale expansion" referred to in Figure 4.

FIGURE 4

At the top of this figure is shown a hypothetical network consisting of
two groups, each of which has three members.

The diagram in the middle shows how the six nodes are initially placed
along a line segment. The two solid arrows pointing to the right in the
top of this figure are the vectors representing the links of Node #1 to
Node #2 and Node #6. The dashed arrow between the solid ones is the
average of the two. Below the line segment are shown the vectors for
the links of Node #6.

The diagram on the bottom of Figure 4 shows how the iterative process
of vector averaging works. The first line shows the initial positions
of the six nodes. The second shows what the means could look like.
Moving from the second to the third lines, the scale has been expanded
so that the nodes range over the entire length of the continuum. The
fourth and sixth lines show the second and third sets of means, while
the expanded versions are shown on the fifth and seventh lines. (Note
that the values shown are not the actual values that would be obtained
for this particular network; they are intended merely to illustrate how
the process might typically look.)

The formula used for calculating a person's mean is shown here:

$$M' = \frac{\sum_{i=1}^{N} wf_i \, S_i \, M_i}{\sum_{i=1}^{N} wf_i \, S_i}$$

where wf_i is the two-step weighting factor, described above; S_i is a
ratio-level indicator of the strength of the link; and M_i is the old
mean of the person to whom the link goes. The summation is done as i
goes from 1 to N, where N is here the number of links that the
individual whose mean we are calculating has.

In the development of this algorithm, different numbers of iterations,
different ways of varying relative contributions of wf_i's, S_i's, and
M_i's, and different ways of assigning the original M_i's were tried.
In general, four to six iterations seemed to be sufficient for any data
set that was examined. If nodes are given subject numbers running from
1 to N, where N is the number of nodes, and these subject numbers are
used as the first approximation for the M_i's, the process seems to
work well for all types of data. In actual tests, when different
subject numbers were assigned to individuals, the solution obtained was
identical to the first solution, which indicates that the process is
not terribly sensitive to the original positions. Usually, the wf_i's
and S_i's are given equal weight, although this has not been tested
extensively.
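
The iterative procedure can be sketched as follows. This is a minimal
illustration under stated assumptions, not the Negopy code: the
function name iterate_means and the data layout are hypothetical, a
node without links simply keeps its position, and the scale expansion
stretches the positions linearly so that they again run from 1 to N.

```python
def iterate_means(n_nodes, links, iterations=5):
    """Move nodes along a line by repeated weighted vector averaging.

    links maps (a, b) -> (wf, s) for a link from node a to node b, where
    wf is the two-step weighting factor and s the link strength.
    Returns a dict mapping node number -> position on the continuum.
    """
    # Group each node's outgoing links once, for convenience.
    outgoing = {node: [] for node in range(1, n_nodes + 1)}
    for (a, b), (wf, s) in links.items():
        outgoing[a].append((b, wf * s))

    # Subject numbers 1..N serve as the first approximation of the means.
    means = {node: float(node) for node in outgoing}

    for _ in range(iterations):
        new_means = {}
        for node, targets in outgoing.items():
            total_weight = sum(w for _, w in targets)
            if total_weight == 0:
                # Assumption: a node with no weighted links stays put.
                new_means[node] = means[node]
            else:
                new_means[node] = (
                    sum(w * means[b] for b, w in targets) / total_weight
                )
        # Scale expansion: stretch the occupied range back to 1..N so
        # the points do not collapse onto a single spot.
        lo, hi = min(new_means.values()), max(new_means.values())
        if hi == lo:
            return new_means
        stretch = (n_nodes - 1) / (hi - lo)
        means = {k: 1 + (v - lo) * stretch for k, v in new_means.items()}
    return means


# Two groups of three (1-2-3 and 4-5-6) joined by one reciprocated
# link, 3-4. All weights are equal in this illustration.
pairs = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
links = {}
for a, b in pairs:
    links[(a, b)] = (1.0, 1.0)
    links[(b, a)] = (1.0, 1.0)

positions = iterate_means(6, links)
for node in sorted(positions):
    print(node, round(positions[node], 2))
```

With equal weights, the two groups of three settle toward opposite ends
of the continuum after five iterations, with nodes 3 and 4 (the ends of
the bridge link) lying closest to the gap between them.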

The result of the application of this process is a continuum, N units
long, with a scattering of nodes along its length. A sample network,
together with the continuum that might result, is shown in Figure 5.
This continuum is used as the input to the next stage of the analysis,
in which tentative boundaries for groups are drawn.

Drawing the Tentative Boundaries



For any human observer, even a casual glance at Figure 5 will be
enough to suggest that there are three clusters of nodes. The computer,
however, must be told what a cluster looks like, and how to look for
one. People probably identify a cluster as an area in which there are a
lot of nodes, surrounded by areas in which there are fewer nodes. This
is essentially what we have the machine look for.

We will need a plot of the "density" of nodes along the continuum. In
order to get such a plot, we construct a "window" and move it along the
continuum, counting the number of nodes visible through the window at
each point. This is shown at the top of Figure 6. The optimum size of
the window, determined by experimentation, appears to be about two
units on an N unit line. Windows smaller than this introduce spurious
statistical information, while with windows larger than this, group
boundaries tend to blur and merge into indistinction. This is shown in
Figure 6, where density plots appear for windows of varying widths. The
result of moving the window down the continuum will be a list of
densities, with one value for each individual. Such a list could be
represented as a bar plot like the one shown in Figure 6.

FIGURE 5

The top of this figure shows a hypothetical network composed of twenty
nodes. Group boundaries are indicated by the dashed lines.

The bottom shows what the final continuum might look like for the
network shown in the top. Again, the group boundaries have been
indicated by dashed lines.

FIGURE 6

This figure shows how the density plot is made. The example uses the
continuum shown in Figure 5. In the top part, the window is shown,
centered successively on the first eight nodes.

The three bar graphs in the middle show the effects of differently
sized windows.

On the bottom is shown the refined version of the plot, with numbers of
nodes visible to the right of the center of the window plotted above
the horizontal and numbers visible on the left of the window plotted
below the horizontal.

With this representation, groups will look like mounds, with boundaries
between groups being indicated by low points. Although it seems as
though this representation would be adequate, there arose problems
which led to an improvement over this simple plot. Although the
problems will not be discussed here, the improvement will: instead of
just counting the number of nodes visible through the window, two
numbers are counted, namely the number visible on the right half of the
window and the number visible on the left half. When constructing the
bar graph, the number visible on the right half is plotted above the
horizontal, while the number visible on the left half is plotted below
the horizontal. The result is shown at the bottom of Figure 6.
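
The split count can be sketched as follows. The sketch assumes that the
window is centered on each node in turn (as in the top of Figure 6),
that the node at the center is not itself counted, and that a node
lying exactly on the edge of the window counts as visible. None of
these details is settled by the text, and the function name
split_density is hypothetical.

```python
from bisect import bisect_left, bisect_right


def split_density(positions, window=2.0):
    """Return one (left_count, right_count) pair per node.

    positions are the nodes' locations on the continuum; the result is
    in continuum order. Nodes at exactly the same spot as the center
    are counted on neither side (an assumption of this sketch).
    """
    pts = sorted(positions)
    half = window / 2.0
    result = []
    for p in pts:
        # Nodes strictly to the left of p, within half a window.
        left = bisect_left(pts, p) - bisect_left(pts, p - half)
        # Nodes strictly to the right of p, within half a window.
        right = bisect_right(pts, p + half) - bisect_right(pts, p)
        result.append((left, right))
    return result


# Two clusters of three nodes on a six-unit continuum.
continuum = [1.0, 1.2, 1.8, 5.2, 5.6, 6.0]
for position, (left, right) in zip(continuum, split_density(continuum)):
    print(position, "left:", left, "right:", right)
```

At the left edge of a cluster all visible neighbors lie to the right,
and at the right edge they all lie to the left, which is what makes the
two-sided plot more informative than a single count.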

The final step in this stage is to have the computer draw lines around
the groups. This is done by locating spots at which there is a large
change as we move from one point on the continuum to the next. If we
count the number of non-overlapping points and divide by the number of
overlapping points for each pair of adjacent nodes on the final bar
plot, we will have a fairly sensitive indicator of group continuity.
This is shown in Figure 7. High values for this ratio will indicate
that there is a large change as we move from one node to the next. Low
values, on the other hand, will indicate that there is only a small
change. If we choose a cutting point and instruct the computer to draw
a line whenever the ratio goes above the cutting point, we will have
told the computer how to draw the boundaries around groups. If the
value of the cutting point is variable, we can alter the sensitivity of
the group spotting routine in either direction. With a window of two
units, a cutting point of 1.0 appears to be optimum for most networks.
Different values, along with the results, are shown in Figure 7.

FIGURE 7

This figure illustrates the boundary-drawing process. The density plot
on the bottom of Figure 6 is shown on the top of this figure. The table
below the plot shows the number of overlapping points, the number of
non-overlapping points, and the ratio of the two numbers for each
successive pair of bars on the bar plot.

The ratios are plotted in the graph in the middle of the page. The
three dotted lines show the three different cutting points.

Below the ratio plot, the original continuum is shown three times. The
first shows the effect of a high cutting point, while the second and
third ones show the results for moderate and low values of the cutting
point.
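
Once the ratio has been computed for each adjacent pair, the
boundary-drawing rule itself is a simple threshold test. The sketch
below takes the ratios as given, since the exact counting of
overlapping and non-overlapping points is defined by Figure 7 rather
than by the text. The function name draw_boundaries and the ratio
values are made up for illustration.

```python
def draw_boundaries(ordered_nodes, ratios, cutting_point=1.0):
    """Split nodes (in continuum order) into tentative groups.

    ratios[k] is the change indicator between ordered_nodes[k] and
    ordered_nodes[k + 1]. A boundary is drawn wherever the ratio goes
    above the cutting point.
    """
    if not ordered_nodes:
        return []
    if len(ratios) != len(ordered_nodes) - 1:
        raise ValueError("need one ratio per adjacent pair of nodes")
    groups = [[ordered_nodes[0]]]
    for node, ratio in zip(ordered_nodes[1:], ratios):
        if ratio > cutting_point:
            groups.append([node])    # large change: start a new group
        else:
            groups[-1].append(node)  # small change: same group
    return groups


nodes = ["A", "B", "C", "D", "E", "F"]
ratios = [0.2, 0.5, 3.0, 0.4, 1.2]   # illustrative values only

for cut in (2.0, 1.0, 0.3):          # high, moderate, low cutting points
    print(cut, draw_boundaries(nodes, ratios, cut))
```

As in Figure 7, a high cutting point yields few, large groups, while a
low one breaks the continuum into many small pieces.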

This concludes the approximate phase of the analysis. The result of
this stage is a list of tentative groups of nodes. The next part of the
analysis involves the testing of this tentative solution and any
alteration that may have to be done to "clean it up."

Using the Criteria for an Exact Solution

This part of the analysis can be divided into two parts. In the first,
individual nodes are tested to see if they meet the relevant criteria
for their role in the network. If they do not, the appropriate changes
are made. In the second, whole groups are tested for the criteria that
are relevant at that level. Again, appropriate changes are made if
necessary. We begin with the individual testing, which is very simple.

Individual Testing

First, people not in groups are tested to see if they meet the
α-criterion for either liaison or group membership in any group. If any
individual does meet the criterion, he or she is reclassified on that
basis. If the individual fails both tests, he or she is labelled as
"type other."

Second, members of groups are tested to see if they meet the criterion
for group membership. Again, if the criterion is not met the
appropriate changes are made.

Because changes made at any point in time can affect the roles of other
people who were tested earlier, the tests are applied twice to make
sure that the final classification will be consistent with itself.

Group Testing

In this section, we change our level of analysis to whole groups,
rather than separate individuals. The criteria to be tested in this
part are the connectiveness and critical link/node criteria. Since the
information generated in the testing of the connectiveness criterion is
necessary in the testing of the other two, it will be covered first.

The basic device used in the testing of these criteria is the distance
matrix, which is constructed for each group. In this n-by-n matrix (n
is the number of members in the group), the element in row i, column j
gives the number of steps needed to get from individual i to individual
j in the group. If there is some finite number in each element of the
matrix, the group will be connected. This means that there will be some
path from each individual in the group to every other individual in the
group. The longest any path could ever be is n-1 steps. A sample
network, together with its distance matrix, is shown in Figure 8.

The distance matrix is constructed as follows. A matrix is constructed
in which there is a row and a column for each node in the group. All
the elements are initialized to zero. Whenever there is a link from
node i to node j, a "1" is entered in row i, column j. If the link is
reciprocated, a "1" is also entered in row j, column i.

A boolean logic operation which is analogous to raising the matrix to
successively higher and higher powers is then performed. Instead of
setting the ij element in the product matrix to the value of the cross
product of the ith row and the jth column, however, the first power on
which this value becomes non-zero is used.

FIGURE 8

At the top of Figure 8 is shown a hypothetical eight-node network. The
matrix directly below the network is a binary version of the network.
In this matrix, each node has a row and a column. The i,j entry of the
matrix is 1 if node i is linked to node j.

The second matrix is the distance matrix for the same network. The
entry in the i,j element of the matrix is the number of links in the
shortest path from node i to node j.

The raising of the matrix to higher powers is stopped when one of two
conditions obtains: either (a) all off-diagonal elements become
non-zero, which implies the group is connected; or (b) when going from
any power k to the next power k+1, no entries change value, which
implies the group is not connected at level k and will never be
connected at any level.
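
A minimal sketch of this construction, including both stopping
conditions, is given below. It is an illustration in Python, not the
routine used in Negopy: the function name distance_matrix is
hypothetical, nodes are numbered from 0 within the group, and a zero
off the diagonal is used to mean "no path found."

```python
def distance_matrix(n, links):
    """Build the distance matrix for a group of n nodes.

    links is an iterable of (i, j) pairs, 0 <= i, j < n, one per link
    from i to j (a reciprocated link appears in both directions).
    Returns (dist, connected), where dist[i][j] is the number of steps
    in the shortest path from i to j, or 0 if no path was found.
    """
    adj = [[False] * n for _ in range(n)]
    dist = [[0] * n for _ in range(n)]
    for i, j in links:
        if i != j:
            adj[i][j] = True
            dist[i][j] = 1

    reach = [row[:] for row in adj]  # pairs joined by a walk of k steps
    k = 1
    while True:
        # Condition (a): every off-diagonal element is non-zero.
        if all(dist[i][j] for i in range(n) for j in range(n) if i != j):
            return dist, True
        # Boolean analogue of multiplying the k-th power by the matrix.
        reach = [
            [any(reach[i][m] and adj[m][j] for m in range(n))
             for j in range(n)]
            for i in range(n)
        ]
        k += 1
        changed = False
        for i in range(n):
            for j in range(n):
                if i != j and reach[i][j] and dist[i][j] == 0:
                    dist[i][j] = k  # first power at which i reaches j
                    changed = True
        # Condition (b): nothing changed, so nothing ever will.
        if not changed:
            return dist, False


# A chain 0-1-2-3 with reciprocated links is connected.
chain = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
dist, connected = distance_matrix(4, chain)
print(connected, dist[0])

# Two separate pairs are not.
pairs = [(0, 1), (1, 0), (2, 3), (3, 2)]
print(distance_matrix(4, pairs)[1])
```

In the chain, the entry for the two end nodes is filled in only at the
third power, which is the longest possible path (n-1 steps) for a group
of four.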

If the group is not connected, it is split into a connected part and
all the rest. Each of the two parts is then treated as a separate
group, and subjected to all the tests that any group must undergo.

At this point, there are only the critical links/nodes criteria
remaining to be tested. These criteria serve as checks against
situations like those shown in the bottom half of Figure 9, where two
groups have been mistakenly identified as one. This situation is
generalized to include situations in which there are any number of
multiple groups, connected in some relatively minimal way, which we
wish to separate into distinct groups. The occurrence of these
confusions is a result of the inelegance of the approximate techniques
used in the first half of the analysis. For analytic purposes, it is
practical to combine these two criteria into a single rule which says
that no subset of some arbitrary size may be removed from a group and
cause the group to become disconnected. If there is such a subset, the
group will be seen to be "really" two or more groups. As a result of
this combination, whenever two groups are joined by a bridge link (a
link between members of different groups), one of the nodes of this
link will be identified as a liaison. That node will later be tested
for the α-criterion of group membership and, if it passes, will be
returned to the group.

FIGURE 9

On the upper left-hand corner of this figure is shown a hypothetical
nine-member network. To the right of this is the distance matrix for
that network. The rightmost column of the matrix contains the means of
the rows of the matrix. The values in this column are thus the mean
number of steps it takes that node to reach all other nodes. The
overall mean for the group, together with the standard deviation of the
distribution of means, is shown below the matrix.

The network in the bottom left-hand corner is an example of the kind of
situation that occurs when two or more groups are identified as a
single group. Clearly, Node #5 is a liaison between the two groups. The
middle matrix on the right half of the page is the distance matrix for
this group. Note the relatively high standard deviation for this group,
compared to the one above it.

The third matrix was constructed after removing Node #5. Note that
there are no values for many of the elements, indicating that the group
is no longer connected. The means shown for this bottom matrix are the
values that would be obtained if the group were split in two, and the
means for each group calculated separately.

The problem has thus been reduced to one of identifying any critical
nodes which may exist in a group. If there is one, it will be the node
with the lowest average distance from all other nodes. This is because
all paths from nodes in either half of the group to the other half must
go through the critical node. The average distance from any node to all
the other nodes is given by the average of all the entries in that
node's row in the distance matrix. This is illustrated in Figure 9. If
there is a set of critical nodes, they will be the nodes with the
smallest row means.

The fact that critical nodes have lower row means than the other
members suggests that there must be some variation in the row means if
there are any critical nodes. We can take advantage of this fact if we
only look for critical nodes when there is some variance. It turns out
that this leads to a large savings in terms of computation time. This
is because of the way we test for critical nodes.

To check a node to see if it is critical, we remove it from the group
and re-calculate the distance matrix. If, as a result of the removal,
the group becomes disconnected, we have found a critical node. If the
group is still connected, we try the next candidate: the node who, of
all the remaining nodes, has the smallest row mean. We will stop this
process after taking out some percentage of the original group members
(usually 10% is enough to "catch" all the critical nodes) if the group
continues to remain connected. If this happens, we put all the removed
nodes back into the group.

It is easy to see that there is a lot of work involved in the searching
for critical nodes. This is why the heuristic device of checking the
variance of the row means is so important. In every network that has
been examined so far, this heuristic has worked correctly. That is, it
did not prevent any critical nodes from being found. Similarly, the
approach of looking at nodes with the lowest row means always finds the
critical nodes. The optimum value to use as a cutting point for the
variance test seems to be about 0.3. Whenever the standard deviation of
the row means exceeds this value, there is likely to be a critical
node. Whenever the standard deviation is less than this value, there is
not.
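
The search for critical nodes, including the variance heuristic, might
be sketched as follows. Several choices here are assumptions rather
than facts about Negopy: distances are found by a breadth-first search
instead of re-raising the matrix to successive powers, links are
treated as reciprocated, row means are taken over the other n-1
members, the population standard deviation is used, and row means are
recomputed after each removal. The names row_means and
find_critical_nodes are hypothetical.

```python
from collections import deque
from itertools import combinations
from statistics import pstdev


def row_means(members, neighbors):
    """Mean distance from each member to every other member.

    Returns None if the group is not connected.
    """
    if len(members) < 2:
        return {m: 0.0 for m in members}
    means = {}
    for start in members:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors.get(cur, ()):
                if nxt in members and nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        if len(dist) < len(members):
            return None
        means[start] = sum(dist.values()) / (len(members) - 1)
    return means


def find_critical_nodes(members, neighbors, sd_cut=0.3, fraction=0.10):
    """Return the removed nodes that disconnect the group, or []."""
    remaining = set(members)
    means = row_means(remaining, neighbors)
    if means is None or len(remaining) < 3:
        return []
    # Heuristic: only search when the row means vary enough.
    if pstdev(list(means.values())) <= sd_cut:
        return []
    limit = max(1, int(fraction * len(remaining)))
    removed = []
    while len(removed) < limit:
        candidate = min(means, key=means.get)  # smallest row mean
        remaining.discard(candidate)
        removed.append(candidate)
        means = row_means(remaining, neighbors)
        if means is None:
            return removed  # the group fell apart
    return []  # still connected: all removed nodes go back


# Two tight groups (1-4 and 6-9) joined only through node 5.
links = list(combinations([1, 2, 3, 4], 2))
links += list(combinations([6, 7, 8, 9], 2))
links += [(5, 3), (5, 4), (5, 6), (5, 7)]
neighbors = {}
for a, b in links:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)

print(find_critical_nodes(range(1, 10), neighbors))
```

In this example the standard deviation of the row means is about 0.34,
so the search is triggered, and removing the node with the smallest row
mean (node 5) splits the group in two.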

After all groups have passed these tests, the obtained classification
of nodes to groups and other roles will be exact. At this point,
various indices may be calculated and the results tabled in any
convenient manner. A flow chart of the algorithm is shown in Figure 10.

II. NEGOPY: THE NETWORK ANALYSIS PROGRAM

In this section, Negopy, the Network Analysis Program, is discussed.
Since there are five parts in the analysis, the discussion is divided
into five parts: describing the data, preparing the data for group
detection, initial group detection, applying the formal criteria, and
printing the results. In each part we will briefly review the relevant
parts of the algorithm, discuss the parameters by which the user may
control the computer (there are 45 parameters which control the
operation of the network program, much as the knobs on a radio control
the way it works), and describe the output produced by this part of the
analysis.

1. Data Description

A. Review

In this stage, the computer will read in the data according to format
statements provided by the user. Any transformations needed to
approximate ratio scaling of link strength are done here.

B. Parameters

P01 -- # OF NODES                    DEFAULT = 0
                                     MAX = 4095

The first parameter that is specified by the user is P01. It tells
the computer how many nodes there are in the network. It also tells the
computer what the highest legal subject number is. In other words, the
computer expects that nodes are numbered from 1 to N, where N is the
number of nodes. If this is the case, N will be the highest legal subject
number. If this is not the case, if subjects are not numbered 1 to N, the
value for this parameter must be set to the largest legal subject number.
For example, say there are 100 nodes, numbered from 101 to 200. Even
though there are only 100 nodes, P01 must be set to 200. In this case,
the computer will assume that there are 200 nodes. Naturally, nodes 1 to
100 will have no links. They will be identified as isolates type one. If
P01 is set to 100, all links to or from nodes having numbers greater than
100 will be rejected.

This is important: all links to or from nodes having numbers greater
than the value of P01 will be rejected by the computer.
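
The rejection rule can be illustrated with a few lines of Python. This is
a sketch only; the function name is hypothetical, and the lower bound of 1
is an assumption based on the statement that nodes are numbered from 1.

```python
def accept_link(respondent, contactee, p01):
    """Keep a link only if both subject numbers lie in 1..P01."""
    return 1 <= respondent <= p01 and 1 <= contactee <= p01


# The example from the text: 100 nodes numbered 101 to 200.
links = [(101, 150), (150, 200), (200, 201)]
kept_with_200 = [l for l in links if accept_link(*l, p01=200)]
kept_with_100 = [l for l in links if accept_link(*l, p01=100)]
```

With P01 set to 200, only the link to node 201 is rejected. With P01 set
to 100, every link in this example is rejected.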

P02 -- # OF LINKS                    DEFAULT = 0
                                     MAX = 32,767

This is an estimate of the number of links. It is used by the
computer to allocate memory space. This parameter should always be set to
a value about 200 higher than the number of expected links. If the actual
number of links exceeds this number, only the first P02 will be read.
The excess links will be ignored.

P03 -- RECIPROCAT                    DEFAULT = 1

If this parameter is set to zero, the computer will drop all
unreciprocated links. If it is set to one, the computer will force
reciprocation by adding the "missing halves" of all unreciprocated links.
In this case, the computer will carry information which tells if the link
is reciprocated, unreciprocated, or "added." This information is printed
at the appropriate times and will be discussed in the sections on how to
interpret the output.
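
A minimal Python sketch of the two settings of P03 follows. The function
name and the one-letter tags are illustrative, and the strength given to
an "added" half (a copy of the reported strength) is an assumption; the
manual does not say what value the program stores.

```python
def apply_reciprocation(links, p03=1):
    """Drop or complete unreciprocated links.

    links -- dict mapping (from_node, to_node) -> reported strength
    Returns a dict mapping (from_node, to_node) -> (strength, tag),
    where tag is "R" (reciprocated), "U" (unreciprocated) or "A" (added).
    """
    result = {}
    for (a, b), strength in links.items():
        if (b, a) in links:
            result[(a, b)] = (strength, "R")
        elif p03 == 1:
            result[(a, b)] = (strength, "U")
            # Assumption: the missing half copies the reported strength.
            result[(b, a)] = (strength, "A")
        # With p03 == 0 the unreciprocated link is simply dropped.
    return result
```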

P04 -- DIRECTION                     DEFAULT = 0

If this parameter is left at the default value of zero, the computer
assumes links are non-directed. That is, if there is a link from A to B,
there is also one from B to A. (Links are always "from" the person
reporting the link.) If the parameter is set to one, the computer assumes
links are directed: a link from A to B does not imply a link from B to A.
P04 is related to P03. If P03 is set to 0 (dropping non-reciprocated
links), P04 should be left at the default value of 0, since all links are
reciprocated and, thus, bi-directional. If P03 is set to one, however,
P04 can be set to either zero or one. If P04 is set to the default value
of 0, the computer will assume that unreciprocated links are evidence of
measurement error and will add the missing halves. If P04 is set to one,
however, the computer will not make this assumption. Although it will
still add links to force reciprocation, it will not use the "added" links
in calculations where this would violate the assumption of directionality.
In this case, the computer will either print only a single value
appropriate to directed links or else print two values, the first
appropriate to non-directed links and the second to directed links. The
computer will always print information regarding this decision.

P07 -- DATA UN                       DEFAULT = 40

If this parameter is left at the default value of 40, the computer
will attempt to read the data from a file which has the local name "data."
If data are on a permanent file, the file must be attached with the local
name "data" before executing the program. If the parameter is set to 60,
the computer will attempt to read the data from the input stream, which
will be cards if the job is submitted through batch. If the job is
submitted from a remote terminal, the data can be either submitted as a
file, with P07 = 40, or else they can be appended to the input stream as
card images, with P07 = 60.

If P07 is set to the incorrect value, the program will terminate
with a message stating that no links have been read.



P31 -- # WEIGHTS                     DEFAULT = 1
                                     MAX = 2

If only a single indicator of link strength is used, P31 should be
left at the default value of one. If two indicators are used, P31 must
be set to two. For information on combining indicators, see the section
on the link-weighting formula, page 12-15.

P08 -- # OBSV/CARD                   DEFAULT = 1
                                     MAX = 10

This parameter indicates the maximum number of observations (links)
that may appear on a card or card image. Links are formatted as shown
below:

    | R# | L#1 | L#2 | L#3 | ... |

The first value must be the respondent number -- the subject number of
the person reporting the link. This value is followed by the fields
describing the links. These fields must contain either two or three
numbers. They will either look like this: | C# | X | or this:
| C# | X | Y |. Here, the "C#" refers to "contactee number" -- the
subject number of the person to whom the link goes. The "X" or "X" and
"Y" refer to indicators of the strength of the link.

There may be up to ten links per card. If a person has more than
P08 links, the first P08 may be put on one card and the others on other
card(s). If more than one card is needed, the second (and subsequent)
cards must be formatted the same as the first. That is, the respondent
number appears in the same columns on later cards as it did on the first
one. There is no limit to the number of links a node may have. If a
node has fewer than P08 links, the rest of the card is left blank.

P08 should be set to the maximum number of links that appear on any
card (i.e., if there are never more than 9 links on a card, it should be
set to 9, etc.).

P09 -- NAME-WIDTH                    DEFAULT = 0
                                     MAX = 2

The program allows up to twenty columns of information to be read in
for each node. Since this is most often used to read in a list of names
of nodes, the list is called a "namelist." The information associated
with each node is printed whenever that node is referred to. If there
is no namelist, P09 should be set to zero. If ten columns or less are
used for each name, P09 is set to one. If up to twenty columns are used,
P09 is set to two. The format of the namelist is discussed in the section
on running the program.

P10 -- LOW WEIGHT                    DEFAULT = 1

P11 -- HI WEIGHT                     DEFAULT = 1
                                     MAX = 255

P10 specifies the lowest legal strength a link may have. Links with
strength less than P10 are dropped. P11 specifies the highest legal
strength a link may have. Links with strength higher than P11 are dropped.
These limits refer to strengths after calculations by the link weighting
formula which appears below.

P15 -- EXP                           DEFAULT = 1
                                     MAX = 4

P37 -- CONS                          DEFAULT = 0

P38 -- MX                            DEFAULT = 1

P39 -- MY                            DEFAULT =

P40 -- CCX                           DEFAULT = 0

P41 -- CCY                           DEFAULT = 0

P42 -- MCPK                          DEFAULT = 0

These parameters are all used in the link weighting formula, which is
shown below.



STRENGTH = [(CONS + MX*Xweight + MY*Yweight)
            + ((CCX + Xweight)*(CCY + Yweight)*MCPK)]^EXP



This formula provides a way of performing a variety of transformations on
the strength indicators. The "Xweight" refers to the first indicator of
the strength of the link. The "Yweight" refers to the second, if there
is a second one (i.e., if P31 = 2). If P31 = 1, the value of Yweight will
always be zero.

The link weighting formula can be broken down into two parts. In
the first, or linear, part, the Xweight is multiplied by MX, or P38.
The Yweight is multiplied by MY, or P39. These two products are then
added to CONS, P37, to form a single value. This part is used either to
form a simple linear combination of two weights or to reverse the scale
on a weight.

An example of linear combination. Say we have two indicators --
time spent in face-to-face conversation and time spent on the
telephone. If we decide that face-to-face counts twice as much
as telephone, we would set MX to 2.0, MY to 1.0, and CONS to
0.0. We would also set CCX, CCY, and MCPK to 0.0 and EXP to
1.0.

An example of scale reversal. Say we have frequency of interaction
as our only weight. It has been coded as 1 = several
times a day, 2 = several times a week, 3 = several times a
month. We wish to reverse the scale so that 1 = several times
a month and 3 = several times a day. To do this we would set
CONS to 4.0, MX to -1.0, and MY to 0.0. We would also set
CCX, CCY, and MCPK to 0.0 and EXP to 1.0.

The second part of the link weighting formula is the cross product
part. It allows a product to be formed between the Xweight and the
Yweight.

An example of cross products. Say we have two indicators of
strength: frequency (X-weights) and importance (Y-weights).
They have been coded as shown below and we wish to combine
them as shown so that:

    FREQUENCY                    IMPORTANCE
    1 = once/month               1 = crucial
    2 = once/week                2 = highly
    3 = once/day                 3 = moderately
    4 = more than once/day       4 = slightly

                              FREQUENCY
                         1     2     3     4
                   1     8    16    24    32
    IMPORTANCE     2     6    12    18    24
                   3     4     8    12    16
                   4     2     4     6     8

An interaction that is of crucial importance and that happens
several times a day is weighted as 32, and one that is only slightly
important and that happens only once a month is weighted as 2.



First, we have to reverse the importance scale. We do this by
setting CCY to -5, which gives us a new importance range of -4 to -1.
Then we set CCX to 0 because we want to keep the frequency coding
intact. Finally, we set MCPK to -2. This (a) doubles the values
we get for the products, and (b) reverses the sign caused by the
reversal of the importance scale. We would set CONS, MX, and MY to
0.0 and EXP to 1. Thus,

STRENGTH = [((0) + Xweight)*((-5) + Yweight)*(-2)]^1



The entire quantity in the linear part is added to the entire quantity
in the cross-product part. At this time, P15, EXP, may be used to raise the
sum to the second, third, or fourth power. If this is not needed, P15 is left
at the default value of 1. P15 may take no values other than 1, 2, 3, or 4.

Because the maximum value of P11 is 255, the upper limit to the final
strength is also 255. That is, the strength of a link may never exceed 255.
If a scale for duration is used, coded in minutes, and the range of values
is 1 to 1000, the range could be reduced by setting MX to 0.25. This would
give a range of 0 to 250, which is acceptable.
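
The whole link weighting calculation, including the final truncation to
an integer and the P10/P11 limits, can be sketched in Python as follows.
This is an illustration only: the function and argument names are
hypothetical, the default of 0 for MY is an assumption, and returning
None for a dropped link is a convention chosen for the sketch.

```python
import math


def link_strength(x, y=0.0, cons=0.0, mx=1.0, my=0.0,
                  ccx=0.0, ccy=0.0, mcpk=0.0, exp=1,
                  low=1, high=255):
    """Return the integer strength, or None if the link is dropped.

    x, y      -- Xweight and Yweight (y stays 0 when P31 = 1)
    low, high -- P10 and P11, applied after the formula
    """
    linear = cons + mx * x + my * y             # linear part
    cross = (ccx + x) * (ccy + y) * mcpk        # cross-product part
    value = math.floor((linear + cross) ** exp)  # truncate, do not round
    return value if low <= value <= high else None


# Linear combination: face-to-face counts twice as much as telephone.
combined = link_strength(30, 10, mx=2.0, my=1.0)             # 70
# Scale reversal: code 1 becomes 3 and code 3 becomes 1.
reversed_codes = [link_strength(x, cons=4.0, mx=-1.0) for x in (1, 2, 3)]
# Cross product: frequency 4 (X) with importance 1 (Y).
crucial_daily = link_strength(4, 1, mx=0.0, ccy=-5.0, mcpk=-2.0)   # 32
# Duration in minutes scaled by MX = 0.25.
longest = link_strength(1000, mx=0.25)                        # 250
shortest = link_strength(1, mx=0.25)   # 0.25 truncates to 0, below P10
```

Note that with the default P10 of 1, a link whose strength truncates to
zero is dropped, which is why the last call returns None.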

The final value for strength is expressed as an integer. Thus, values
are truncated (rather than rounded) to the next lowest integer. A fractional
value less than one -- say 0.75 or 0.99 -- would be truncated to zero.

P06 -- # RAW PRINT                   DEFAULT = 10

The computer prints both the raw data and the final strength values
for links until it has read P06 acceptable links. This means that if
only one out of every four links is acceptable after transforming
according to the link weighting formula, the computer will print the first
forty raw links as well as the first ten good final links. This is useful
in checking the link weighting formula for correctness.

P34 -- MEAN STRST                    DEFAULT = 0

If this parameter is set to one, the mean value for both halves of
each reciprocated link will be calculated, printed out, and used to
replace the original values. If it is left at the default value of zero,
the mean will be calculated and printed, but the original values will be
retained.



C. Output

The program begins its printout with a list of the parameters and their
settings. A portion of this list is shown below.

If a parameter is left at a default value, the word "DEFAULT" will appear
by that parameter. If the user supplies a value, the word "USER" will appear
instead. This list is followed by the namelist, if there is one. In this
namelist, the "names" will appear in numerical order, by subject number.

Following the namelist will be the first P06 good links.

After all the links have been read into the computer, links will either
be dropped or added, depending on the value of P03. The computer indicates
how many links were dropped or added in this process.

    PARAMETER NO. 11  (HI WEIGHT )   VALUE   10   **USER**
    PARAMETER NO. 12  (INOPERATIV)   VALUE    0   DEFAULT
    PARAMETER NO. 13  (DNSTY HIST)   VALUE    1   DEFAULT
    PARAMETER NO. 14  (SCAN RDIUS)   VALUE  200   DEFAULT

    A portion of the parameter list



Next comes the first major part of the output: the link list. In
this list all the links are displayed, beginning with the links for Node #1,
then Node #2, and so on. A part of this list is shown in Figure 11. If
a link is reciprocated, there will be an "R" after the node number to whom
the link goes. Similarly, there will be a "U" for an unreciprocated link
and an "A" for added links. In the example shown in Figure 11, Node No. 2
listed Node No. 8, but Node No. 8 did not list Node No. 2. On the other
hand, Node No. 12 listed Node No. 2, who did not list Node No. 12.

To the right of the reciprocation indicator is the word "STRENGTH="
followed by the strength of the link as reported by the respondent. If the
link was reciprocated, the words "STRENGTH IN=" will appear, followed by
the strength as reported by the other person. In the example, Node No. 2
reported a link to Node No. 3 with a strength of 5. Node No. 3, however,
reported a link to Node No. 2 with a strength of 7. The number after the
words "MEAN STR.=" is the mean of the incoming and outgoing strengths. The
number after "DISCR=" is the discrepancy between incoming strength and
outgoing strength (calculated as outgoing minus incoming). Finally, "ID="
is followed by the "name" of the person to whom the link goes.

After all the links for the node are listed, there appear two
tables. The first is a "DISCREPANCY TABLE," where the differences
between outgoing and incoming strengths are analyzed. Obviously, this
can only be done for reciprocated links. There are two rows in this
table. The first refers to actual values, while the second refers to
absolute values (ignoring the signs). There are also two columns. In
the "SUM" column the discrepancies have been added together to a total.
In the "MEAN" column the sum has been divided by the number of
reciprocated links, which appears to the right of the table.

If the values in the "ACTUAL" row are small, this could be due to
one of two situations. Either all the discrepancies are small, or else the
discrepancies are large, but the positive ones are balanced out by the
negative ones. If the values in the "ACTUAL" row are large, this means
the node consistently differed in estimating the strengths of links
from those it was linked to, and that the difference was usually in the
same direction. In other words, such a node could be said to be an
overestimator if its "ACTUAL" values were large and positive, or an
underestimator if its values were large and negative. Thus, the "ACTUAL"
values are useful to see if there is any systematic bias in the direction
of the discrepancy.

The values in the "ABSOLUTE" row, in contrast, reflect only the
magnitudes of the discrepancies. Nodes with low values here reported
strengths that were very close to the strengths reported by the nodes
they were linked to. Nodes with high values, on the other hand, were
simply not in agreement with the nodes they were linked to.

Following the discrepancy table is the "STRENGTH TABLE." The numbers
under "OUT" refer to strengths as reported by the node whose number
appears at the top of the link list (Node No. 2 in the example). The
numbers under "IN" refer to strengths as reported by the nodes linked to
the respondent. The numbers under "TOTAL" refer to all the strengths
combined. The first row refers to reciprocated links only, the second
to unreciprocated links only, and the third to all links combined. There
are thus nine cells in this table.

In each cell are: the sum of the strengths for the appropriate
links (after "S="); the number of links (after "N="); and the mean strength
(after "M="). Whenever there is a blank entry, with simply a ".", there
were no links for that cell and the value is zero.

With this table we can make statements like the following: "On the
whole, for Node 2, outgoing links (i.e., links reported by Node 2) were
stronger than incoming links. However, the difference was greater for
unreciprocated links, difference = 8.0-6.0 = 2.0, than for reciprocated
links, difference = 7.0-5.75 = 1.25," or "Reciprocated links with Node 2
were stronger than unreciprocated links."

There is a discrepancy table for each node that has some reciprocated
links. When there are no reciprocated links, the discrepancy table
and the top row of the strength table do not appear. In addition, if
the node has no links, the computer prints a note to that effect after
the node number, and skips to the next node.

After all the links for all the nodes have been listed, there
appears a small table labelled "RECIPROCATION ANALYSIS." Here the
entire set of links is broken down into types, according to whether they
were reciprocated (two-way), unreciprocated (outgoing), or added
(incoming).

Finally, there is a section headed "STRENGTH DISTRIBUTION ANALYSIS"
where the distribution of strengths is analyzed. After a self-explanatory
description of the range of strengths appears a histogram where each
legal strength value has a row and the length of the bar in that row is
proportional to the number of links with that strength. If the longest
row has more than 100 links, the lengths of all rows will be divided by
10. If the longest row would still be over 100 X's, the lengths are
reduced again until the longest row has less than 100 X's. The exact
scale factor is indicated at the top of the table. An example of this
histogram is shown in Figure 12.

2. Preparation for Group Detection

A. Review

The computer begins this segment by identifying all the
non-participants. It separates them from the participants since they are
not used until the final stage of analysis. Following this, the computer
treats each link as a vector going from the respondent to the contactee.
Links are weighted according to their strength and the within-group
weighting factor. The mean vector is calculated for each node and used to
relocate the node for the next round of means. The whole process is
repeated a number of times.

B. Parameters

P44 -- 2-STEP WT                     DEFAULT = 1.00

This parameter is used to adjust the contribution of the within-group
weighting factor. This factor, used as an indicator of the probability
that the link is a within-group link, is calculated as the number of
two-step links connecting the nodes plus one. In the accompanying
example, the weighting factor for the A-B link would be three, because of
the indirect links connecting A and B through C and D.

The formula used to calculate the new mean for node j is:

    MEAN_j = SUM_i [ M_i * (w_ji * P44) ] / SUM_i (w_ji * P44)

(Discussed also on page 29.)

By adjusting P44, the influence of the within-group weighting factor can
be varied. P44 is usually set to 1.0. Lower values are not recommended;
the effect of higher values has not been tested.

P05 -- N OF ITERS                    DEFAULT = 4

This parameter specifies the number of iterations to perform -- the
number of times means are to be calculated. Four is usually sufficient.
For datasets with very large numbers of links, it may be necessary to use
six or seven. The best number is determined by experimentation.

C. Output

Output for this segment begins with a list of non-participants as
the computer finds them. Isolates will be located first, followed by
tree nodes. After all non-participants are located, the computer removes
links to non-participants from the link lists of participants. This is
done to simplify the rest of the analysis. There is no more output from
this segment unless there are not enough participants to continue. If
this is the case, the program will terminate here with a message.

3. Initial Group Detection

A. Review

The result of the preceding segment is a continuum with nodes
scattered along its length. In this segment, the computer examines this
continuum by moving a "window" down the continuum, counting the number
of nodes appearing in both halves of the window. These numbers are
displayed in a density histogram, and analyzed to locate boundaries of
groups. A group boundary is drawn whenever the transition from one
location of the window to the next causes a shift in density values that
is larger than the sensitivity parameter.

B. Parameters

P14 -- SCAN RADIUS                   DEFAULT = 200

This parameter governs the radius of the window that is used to
calculate densities along the continuum. The value of P14 is actually one
hundred times the width of the window, where the continuum is N units
long, where N is the number of nodes in the network. A value of 200 has
been found to be optimal for most networks. If the density histogram
appears to be "blurry," with gradual, smooth transitions from group to
group (as in "A" below), the radius should be decreased to a smaller
value. If the value is made too small, however, the histogram will be
very thin and spotty, as shown in "C" below. In this case there will
probably be no groups identified.

[Figure: three example density histograms. A. A blurry histogram.
B. A sharp histogram. C. A thin, spotty histogram.]
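
The density count behind each row of the histogram can be sketched in
Python. Several details here are assumptions made for the sketch: P14/100
is taken as the full width of the window, a node counts itself, and
positions are given as plain numbers on a continuum N units long. The
function name is hypothetical.

```python
def window_density(positions, p14=200):
    """Count the nodes visible through a window centred on each node.

    positions -- dict: node -> location on the continuum
    p14       -- one hundred times the window width (assumed full width)
    """
    half_width = (p14 / 100.0) / 2.0
    density = {}
    for node, centre in positions.items():
        density[node] = sum(
            1 for p in positions.values() if abs(p - centre) <= half_width
        )
    return density


# Three nodes close together and one far away.
positions = {1: 0.0, 2: 0.5, 3: 0.9, 4: 5.0}
density = window_density(positions)
for node in sorted(density):
    print(f"{node:4d}  " + "X" * density[node])
```

A wider window merges neighbouring clusters into one long run of rows,
which is the "blurry" case; a very narrow one leaves most rows with a
single X, which is the "thin, spotty" case.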


P13 -- DNSTY HIST                    DEFAULT = 1

If P13 is set to zero, no density histogram will be printed. It
is recommended that this not be done, since the histogram provides
valuable information in a relatively compact form.

P23 -- GRP-SNSVTY                    DEFAULT = 100

This parameter is discussed on page 35. Whenever the computer detects
a group boundary, it draws a line across the histogram. If it is felt
that the computer is not splitting the continuum into enough groups, the
sensitivity can be raised to a higher value. Similarly, if the computer
is making too many splits, cutting groups apart, the value can be lowered.
The value should not be changed by more than twenty units at a time, as
it has drastic effects on the pattern-recognition routines that look for
groups.

C. Output

The output from this section consists entirely of the density
histogram that has been discussed above. A segment of a density histogram
is shown below. The numbers along the left side are node numbers. The
numbers immediately to the right of the node numbers (the zeros and ones)
refer to the "scanning mode" of the computer at that point on the
continuum (which has been turned vertically so the top is the left end and
the bottom is the right end). When the computer is working on a group,
it is in group mode and there will be a "1" to the right of the subject
number. When the computer is between groups, there will be a "0" instead.
The horizontal lines are boundaries of groups. The number of "X's" in
a row equals the number of nodes visible through the window when it is
centered over the node whose number appears on the left end of the row.

At the bottom of the histogram is the phrase "NGRP FROM GROUP IS,"
followed by a number. This is the number of groups found by the computer
in its analysis of the density histogram. If there are no groups, the
computer will stop here with a message.

4. Application of Formal Criteria to the Tentative Solution

A. Review

In this segment the computer applies the formal criteria to the
tentative solution provided by the third segment. First, individual nodes
are tested to see if their role classification is consistent with the
role definitions. This is done by computing the appropriate proportions
of linkages with group members and comparing the results to the criterion
levels as specified.

If there are enough type "others" at this point, the computer
attempts to construct additional groups from these nodes.

The computer then proceeds to test each group to see if it meets
the criteria for groups. Most of this work is done with the aid of the
"distance matrix" which is discussed on pages 39-42. If a group is not
connected, it is split apart into a connected part and all the rest. If
the variance of row means in the distance matrix is high enough, the
computer tries to split apart the group by removing critical nodes. If
the computer succeeds, it makes two groups out of the remaining members.
It then applies all the relevant tests to those new groups. If the
computer does not succeed at splitting apart the group, it returns all the
nodes it took out in the process of trying to split the group.

Finally, after all groups have been tested, the computer applies the
appropriate criteria to the nodes which remain outside groups.

B. Parameters

P36 -- PERW                          DEFAULT = 50.01

This is the percentage used in all the criteria. It is discussed
on page 5. The default value of 50.01% is as low as possible for
unambiguous classifications of nodes. Because higher values have not been
tested extensively, it is not possible to say how they affect the
operation of the program.

P22 -- MIN SPLIT                     DEFAULT = 12
                                     MIN = 5

P24 -- SPLIT LEV                     DEFAULT = 30
                                     MIN = 5

The computer will attempt to split any group that has at least P22
members by removing critical nodes, if the standard deviation of the row
means in the distance matrix is greater than the value of P24 divided by
100. That is, if both P22 and P24 are left at their default values, the
computer will attempt to split any group having 12 or more members
whenever the standard deviation of that group's row means exceeds 0.30.

P43 -- DROP-SPLIT                    DEFAULT = 0.10

This parameter specifies how many nodes may be removed in attempting
to split a group. If P43 is set to a number greater than 1.0, the number
is the largest number of nodes that will be removed. If P43 is set to a
fraction between 0.0 and 1.0, up to this proportion of the group will be
removed. Thus, a value of four means that up to four nodes will be
removed. A value of 0.1 means that up to 10% will be removed.
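
How P22, P24, and P43 work together can be shown with a small Python
sketch. The function names are hypothetical. Treating a value of exactly
1.0 as a proportion, and truncating the proportional limit to a whole
number of nodes, are assumptions; the manual does not cover either case.

```python
def should_try_split(group_size, row_mean_sd, p22=12, p24=30):
    """True when the group is large enough and its row means vary enough."""
    return group_size >= p22 and row_mean_sd > p24 / 100.0


def max_removals(group_size, p43=0.10):
    """Largest number of nodes that may be removed while trying to split."""
    if p43 > 1.0:
        return int(p43)              # an absolute number of nodes
    return int(group_size * p43)     # a proportion of the group (assumed truncation)


# A group of 40 members whose row means have a standard deviation of 0.35.
attempt = should_try_split(40, 0.35)      # True
by_count = max_removals(40, p43=4)        # 4 nodes
by_share = max_removals(40, p43=0.10)     # 10% of 40 = 4 nodes
```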

C. Output

The output of this segment begins with a brief reporting of the
results of the tests of individuals. When a node is reassigned to a
different role, the new role is indicated. After this has been done, the
word "GRZAP" appears, followed by a list of the tentative groups (they
have not yet passed the group criteria testing).

NOTE: The groups at this point are numbered from 1 to N, where
      N is the number of groups. This numbering may change
      as the groups are tested. In fact, after testing there
      may be "empty groups" -- numbers for which there are no
      groups. For example, there may be groups numbered
      1, 2, 4, 6, and 9. There are no groups numbered 3, 5,
      7, or 8. There are only five valid groups. (This is
      inconvenient, I know. I will fix it someday, maybe.)

The computer now begins to apply the criteria to the groups. The
first step is to construct a distance matrix. Because this is done in
a routine called "CONNECT," the phrase "NOW ENTERING CONNECT FOR GROUP X"
appears at the top of the page. Immediately under this is a line that
tells whether directed or non-directed links are assumed. (Non-directed
links imply a symmetrical distance matrix.)

The computer constructs a binary matrix, where the i,j entry is "1"
if there is a link from node i to node j. All other entries are zero.
The number of one-step links is printed out. The computer then raises
the matrix to higher and higher boolean powers, where the result after
each power is to put the number of the power in the entries corresponding
to pairs of nodes that can reach each other in a number of steps equal
to the number of the power. Thus, on the second power, a "2" is placed
in the i,j entry of the matrix if node i can reach node j in two steps
(i.e., with one intermediate node). At each power the number of additional
connections that are made is printed.

This process stops when either (a) all pairs of nodes are connected;
or (b) no new connections are made when going from one power to the next,
which implies that no new connections would ever be made by going to
higher powers. The computer prints out the highest power used, together
with the result (i.e., "CONNECTED AT LEVEL 4" or "CONNECTING HAS STOPPED AT
LEVEL 4").
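
The construction of the distance matrix by successive boolean powers can
be sketched in Python as follows. The function name and the return
convention are illustrative, and the reported level is taken here to be
the last power that produced new connections, which is an assumption
about what "highest power used" means.

```python
def distance_matrix(nodes, links):
    """Build a distance matrix by repeated boolean multiplication.

    nodes -- list of node numbers in the group
    links -- set of (i, j) pairs; give both directions for
             non-directed links
    Returns (dist, level, connected). Entries left at 0 off the main
    diagonal mark pairs that cannot reach each other.
    """
    n = len(nodes)
    index = {node: k for k, node in enumerate(nodes)}
    one_step = [[0] * n for _ in range(n)]
    for i, j in links:
        one_step[index[i]][index[j]] = 1
    dist = [row[:] for row in one_step]
    reach = [row[:] for row in one_step]   # reachable in exactly `level` steps
    level = 1
    while True:
        unconnected = [(r, c) for r in range(n) for c in range(n)
                       if r != c and dist[r][c] == 0]
        if not unconnected:
            return dist, level, True       # all pairs are connected
        # Boolean product: the next power of the one-step matrix.
        nxt = [[int(any(reach[r][k] and one_step[k][c] for k in range(n)))
                for c in range(n)] for r in range(n)]
        new = [(r, c) for r, c in unconnected if nxt[r][c]]
        if not new:
            return dist, level, False      # connecting has stopped
        level += 1
        for r, c in new:
            dist[r][c] = level             # record the power number
        reach = nxt


# A chain 1 - 2 - 3 - 4 with non-directed links.
chain = {(1, 2), (2, 1), (2, 3), (3, 2), (3, 4), (4, 3)}
dist, level, connected = distance_matrix([1, 2, 3, 4], chain)
```

For the chain, the two end nodes are first joined on the third power, so
the matrix is complete at level 3.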

The computer then prints out the distance matrix. A matrix assuming
non-directed links is always constructed, regardless of the value of P04.
This matrix is used for all testing. However, when P04 is set to 1
(directed), an additional matrix is constructed with directional links.
This matrix may be asymmetrical. When this matrix appears, it is
identified as having used directed links.

For large groups, the entire matrix might not fit on a single page. If
this happens, the matrix is printed in strips. To reconstruct the matrix,
simply put the strips together.

After the matrix is printed there is a section headed by the phrase 
"ANALYSIS OF DISTANCE ^lATRIX". Here there is a table with six columns. 
The first column is the node number. Column Two'ha^ the sum ofv^ll entries 
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in that node's colunn in the matrix. Column Three has the mean. The 
mean is calculated byWviding th^ sm by N-1, v^ere N is the number of 
nodes in the group. The "-1" is because the elements on the main diagonal 
"are not counted. Rows Four and Five have, row totals and means. Colinm 
Six oiily appears i£ namelists are used, and it has the node'.s '*name." 

Below this table is the group column average -- the average of all 
the numbers in Column Three -- and the standard deviation of the column 
averages for the members in the group. It is this last number that is 
used to decide whether or not to try to split the group. 
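The arithmetic behind this table can be illustrated with a short Python sketch. It assumes a fully connected group (every off-diagonal entry is a number), and it uses the population standard deviation; the manual does not say which form of the standard deviation the program uses, so that choice, like the name `analyze_distance_matrix`, is an assumption.

```python
from statistics import pstdev


def analyze_distance_matrix(dist):
    """Column sums and means of a distance matrix, skipping the main diagonal.

    dist is a square list of lists of path lengths (hypothetical input).
    """
    n = len(dist)
    col_sums = [sum(dist[i][j] for i in range(n) if i != j) for j in range(n)]
    # Each mean divides by N-1 because the diagonal element is not counted.
    col_means = [total / (n - 1) for total in col_sums]
    row_sums = [sum(dist[i][j] for j in range(n) if j != i) for i in range(n)]
    row_means = [total / (n - 1) for total in row_sums]
    return {
        "col_sums": col_sums,
        "col_means": col_means,
        "row_sums": row_sums,
        "row_means": row_means,
        "group_average": sum(col_means) / n,
        # Assumption: population SD of the column means.
        "spread": pstdev(col_means),
    }
```

A group whose members all sit at similar distances from one another gives a small spread; a group made of two loosely joined clusters gives a larger one, which is why this number is the one compared against the splitting criterion.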

Should the computer decide to try to split a group, the whole pro- 
cess is recorded for later use. When a group is split, there will be at 
least two matrices printed -- one before splitting and one after splitting. 
The second one should be used, as the first one is now obsolete. 

After the last distance matrix and analysis table, there is a list 
of nodes not in groups, together with their final classifications (this 
list does not include non-participants). 

5. Final Results and Control of Output . 

A. Review 

In this section, we review the parameters that control the output, 
along with the tables that summarize the results of the analysis. 

B. Parameters 

P17 -- FILE OUTPUT                  DEFAULT = 0 

If P17 is set to one, a file called "PUNCH" will be generated after 
execution. This file contains all the list structures which represent the 
network and may be used as input to subsequent analysis programs. For 
more information about this file, write the author. 

P18 -- PRINTO SUP                   DEFAULT = 0 

If this parameter is set to one, the list of links at the beginning 
of the program will be suppressed. 

P19 -- GRID SUP                     DEFAULT = 1 

If this parameter is set to zero, the group structure will be printed 
out at several intermediate stages in the analysis. This information is 
useful for tracing the groups as they go through the various tests. 

P20 -- GROUP SUP                    DEFAULT = 0 

If this parameter is set to one, the computer will not print the 
final tables describing the communication structure of each group. 
(These tables are described below.) 

P21 -- MAX OUTPUT                   DEFAULT = 15 

This parameter is used for debugging the program. Values lower than 
15 cause great volumes of cryptic information to be printed. (This para- 
meter will soon become inoperative.) 

P32 -- ISOSUP                       DEFAULT = 1 

If this parameter is left at the default value of one, isolates type 
one will not appear on the link list at the beginning. (These are the 
nodes that have no links.) To prevent the suppression of these nodes, 
set P32 to zero. 

P33 -- DETAILS                      DEFAULT = 0 

If P33 is set to one, the computer will print more details about 
testing both the individuals and the groups for the criteria. This is 
useful for verifying the "goodness" of the final solution. However, in 
most cases this information is not needed. As the program is tested 
more thoroughly, we can be more confident in its results, and this infor- 
mation is less and less valuable. 

P35 -- PUNCH DECK                   DEFAULT = 0 

If P35 is set to one, the computer will punch a deck which contains 
essentially the same information that appears on the very last table. 
This includes the node number, the role assigned to that node, the group 
number if the node is a group member, or an isolate or tree node attached 
to a group member, the integrativeness score for the node, and the "name" 
for the node. The format is as follows: one card/card image per node; 
(4I5,2A10). This information appears on the file called "PUNCH." 
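A card image in the (4I5,2A10) layout can be read back with a few lines of Python. The sketch below assumes the four integer fields appear in the order listed above (node number, role, group number, integrativeness score), and that the score is punched on the same zero-to-1000 scale as the final printed table; neither point is stated explicitly for the deck, and the meaning of the role codes is not given here. The function and key names are hypothetical.

```python
def parse_punch_card(card):
    """Split one (4I5,2A10) card image into its fields."""
    card = card.ljust(40)

    def integer(field):
        # FORTRAN integer fields read as zero when left blank.
        return int(field) if field.strip() else 0

    node, role, group, score = (integer(card[k:k + 5]) for k in range(0, 20, 5))
    return {
        "node": node,
        "role": role,            # numeric role code; meanings not listed here
        "group": group,
        "integrativeness": score / 1000.0,  # assumption: 0-1000 scale
        "name": card[20:40].rstrip(),       # 2A10 = twenty characters
    }
```

For example, a card carrying node 7, role code 1, group 3, score 667 and the name JONES would be returned with an integrativeness of 0.667.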

C. Output 

The output from the final section of the program comes in three 
parts. First, there is a set of tables for each group. Second is a 
complete listing of all nodes that are not group members. Finally there 
is a numerically ordered summary of all the nodes, with a description 
of each node's role and integrativeness score. 

We start with the group tables, which are the first to appear. 

The tables for the first valid group begin with the phrase "START OF 
INFORMATION FOR GROUP X," where X is the group's identification number. 
The computer then prints information about each node in the group. The 
information for a node begins with a line that says "MEMBER NUMBER X ID =" 
where the node's subject number is X. The node's "name" appears 
after the "ID=". Following this line is a list of all the links with 
that node. Each link is described in a line that begins with either "LINK 
WITH", "LINK TO", or "LINK FROM", followed by the number of the node at the 
other end of the link. "LINK WITH" means the link is reciprocated. "LINK 
TO" means that the link is unreciprocated, not being returned by the node 
at the other end of the link. "LINK FROM" means that the link was not 
reciprocated, not being returned by the group member whose links are 
being listed. A "TO" link would be given a "U" in the big link list 
printed at the beginning of the printout. A "FROM" would be an "A" and 
a "WITH" would be an "R". 
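The three link labels follow directly from who reported the link. As an illustration only, the Python sketch below classifies the links of one node from a set of reported (respondent, contactee) pairs; the data structure and the function name `describe_links` are assumptions made for the example, not part of the program.

```python
def describe_links(node, reported):
    """Label each link of `node` the way the group tables do.

    reported is a set of (respondent, contactee) pairs. The result maps the
    node at the other end of each link to (label, code in the big link list).
    """
    described = {}
    for respondent, contactee in reported:
        if respondent == node:
            other = contactee
        elif contactee == node:
            other = respondent
        else:
            continue
        sent = (node, other) in reported
        received = (other, node) in reported
        if sent and received:
            described[other] = ("LINK WITH", "R")   # reciprocated
        elif sent:
            described[other] = ("LINK TO", "U")     # not returned by other node
        else:
            described[other] = ("LINK FROM", "A")   # not returned by this node
    return described
```

With reports (1,2), (2,1), (1,3) and (4,1), node 1 has a "LINK WITH" node 2, a "LINK TO" node 3, and a "LINK FROM" node 4.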

Following the number of the other node is a description of the type 
of link. This could be either "WITHIN-GRP", which means that the link is to 
another member of the same group; a "BRIDGE" link, which means it is a link 
to a member of another group; a "LIAISON" link, which means it is a link 
to a liaison; or an "OTHER" link, which is a link to a type "other." 

To the right of the link type indicator are the letters "W=" followed 
by the within-group weighting factor discussed on pages 25 and 26. (This 
weighting factor equals the number of two-step links connecting the nodes, 
plus one.) The weighting factor is followed by the strength of the link as 
reported by the group member being analysed. (If P34 was set to one and 
the link was reciprocated, the strength will be the mean of both halves 
of the link.) Finally, the "name" of the node to whom the link goes is 
printed. 

At the end of the description of links for the node is the phrase 
"INTEGRATIVENESS OF NODE X IS Y" where "X" is the node number and "Y" 
is the integrativeness of that node. Integrativeness is calculated as the 
number of links between the nodes linked to the original node divided 
by the largest possible number of such links. The value ranges from 
zero to one. 
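The integrativeness ratio can be worked through in a few lines of Python. The sketch treats links as non-directed and returns zero for a node with fewer than two contacts (where no links among contacts are possible); both choices, and the name `integrativeness`, are assumptions for the illustration rather than a description of the program's internals.

```python
from itertools import combinations


def integrativeness(node, links):
    """Links among a node's contacts divided by the largest possible number.

    links is a set of frozenset({a, b}) non-directed links (hypothetical input).
    """
    contacts = sorted({x for link in links if node in link
                       for x in link if x != node})
    possible = len(contacts) * (len(contacts) - 1) // 2
    if possible == 0:
        return 0.0  # assumption: undefined cases are reported as zero
    actual = sum(1 for a, b in combinations(contacts, 2)
                 if frozenset((a, b)) in links)
    return actual / possible
```

If node 1 is linked to nodes 2, 3 and 4, and only 2 and 3 are linked to each other, one of the three possible links among the contacts exists, so the score is one third.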

Directly below the integrativeness score is a "LINK ANALYSIS MATRIX" 
which is actually two matrices. The one on the left is calculated using 
numbers of links, while the one on the right is calculated using strengths 
of links. Each matrix has five columns and six rows. The first column 
has values for TWO-WAY (reciprocated) links; the second has values for 
OUTGOING (unreciprocated) links; and the third for INCOMING (added) 
links. The fourth column has row totals and the fifth row percentages. 
The first row has numbers for within-group links, the second for between- 
group links, the third for liaison links, and the fourth for other links. 
The fifth row has column totals, and the sixth, column percentages. 

With these tables, it is easy to make statements about the percen- 
tage of within-group linkage that is reciprocated, the relative strengths 
of within-group links and between-group links, and so on. 
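The layout of one such matrix (six rows by five columns) can be sketched in Python as below. Counting each link as 1 gives the "numbers" matrix and passing the reported strength gives the "strengths" matrix. The manual does not say what base the percentages use; the sketch assumes they are percentages of the grand total, and the names `ROWS`, `COLS` and `link_analysis_matrix` are hypothetical.

```python
ROWS = ["WITHIN-GRP", "BRIDGE", "LIAISON", "OTHER"]
COLS = ["TWO-WAY", "OUTGOING", "INCOMING"]


def link_analysis_matrix(links):
    """Tally (link_type, direction, amount) triples into a 6 x 5 table."""
    cells = {r: {c: 0 for c in COLS} for r in ROWS}
    for link_type, direction, amount in links:
        cells[link_type][direction] += amount
    grand = sum(sum(row.values()) for row in cells.values())

    def percent(value):
        # Assumption: percentages are taken of the grand total.
        return 100.0 * value / grand if grand else 0.0

    table = []
    for r in ROWS:
        total = sum(cells[r].values())
        table.append([cells[r][c] for c in COLS] + [total, percent(total)])
    col_totals = [sum(cells[r][c] for r in ROWS) for c in COLS]
    table.append(col_totals + [grand, percent(grand)])          # fifth row
    table.append([percent(t) for t in col_totals] + [percent(grand), None])
    return table
```

From such a table the share of within-group linkage that is reciprocated is simply the TWO-WAY entry of the first row divided by that row's total.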

After all the nodes in the group have been analyzed, there is a 
"GROUP LINK ANALYSIS MATRIX." The tables at the top of this set of 
matrices are identical to the link analysis matrices printed for each 
individual node, with the exception that all links with group members 
are included, instead of only links with a single node. In addition to 
the "NUMBER OF LINKS" and "STRENGTHS OF LINKS" tables, there are two 
other tables -- "AVERAGE WEIGHTING FACTOR" and "AVERAGE STRENGTH." Both 
of these tables have columns for two-way, outgoing, incoming, and total; 
both have rows for within-group, between-group, liaison, other, and total. 
The four tables in the group link analysis matrix are set off from the 
rest of the printout by dashed lines drawn across the page. 

At the bottom of the group link analysis matrices is a specification 
of the type of relation chosen (P04) and a calculation of group connective- 
ness (density). This value is calculated as the number of within-group 
links divided by the maximum possible. At this point, the computer moves 
on to the "START OF INFORMATION" for the next group. 
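As a worked illustration of the connectiveness ratio, the Python sketch below counts within-group links and divides by the maximum possible. It assumes the maximum is n(n-1)/2 for non-directed links and n(n-1) for directed links in a group of n members; the manual says only "the maximum possible," so this reading, and the function name, are assumptions.

```python
def connectiveness(members, links, directed=False):
    """Within-group links divided by the maximum possible number."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    inside = [(a, b) for a, b in links
              if a in members and b in members and a != b]
    if directed:
        count = len(set(inside))
        maximum = n * (n - 1)
    else:
        # A reciprocated pair counts as a single non-directed link.
        count = len({frozenset(pair) for pair in inside})
        maximum = n * (n - 1) // 2
    return count / maximum
```

A four-member group with links 1-2 (reported by both), 2-3 and 3-4 has three of six possible non-directed links, giving a connectiveness of 0.5.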

After the last set of connectiveness calculations are the lists of 
nodes that are not group members. This set includes, in order, isolates 
type one, isolates type two, isolated dyads, tree nodes, "others," and 
liaisons. For each category there is a list of nodes that fit that 
category, with an analysis of all the links for that node. This break- 
down specifies who each link is with, what kind of node they are, and 
what that node's "name" is. When appropriate, the integrativeness of 
the node is printed along with a breakdown of links into two-way, in- 
coming, and outgoing. 

After the last liaison is described, there is a small table telling 
how many nodes of each type there were. Finally, the last table printed 
as part of the analysis contains a list of all the nodes, in ascending 
numerical order, with a specification of the role each node is assigned 
to, the group they are a member of or attached to, and their integrative- 
ness score and "name." Integrativeness scores range from zero to 1000 
on this list; these values must be divided by 1000 to correctly locate 
the decimal point. Integrativeness scores for tree nodes having negative 
values should be ignored. This concludes the output of the Network 
Analysis Program. 



PART FOUR 
USING THE NETWORK ANALYSIS PROGRAM 



This part is divided into two sections. In the first, the actual 
running of the Negopy Program is discussed. This section includes the 
specification of parameters, the setting up of input decks, and so on. 
The second section covers the less mechanical aspects of using the pro- 
gram: error messages and what they mean, how to interpret strange 
results, "fine tuning" the program, and known bugs in the program. 

I. SETTING UP A NETWORK ANALYSIS RUN 



In general, any Network Analysis run includes two kinds of infor- 
mation: control cards and data. The data, of course, include all the 
cards with information describing the links, as well as a "namelist," 
if there is one. There are two basic types of control cards -- system 
control cards and Negopy control cards. The system control cards tell 
the computer what to do -- which program to execute, how much memory 
is needed, how long the program may run, and so on. The Negopy control 
cards set the parameters in the program, describe the format of the 
data (and the namelist), and provide some other information the program 
needs to execute the run. Since the system cards come first, they will 
be discussed first. 

System Control Cards 

The system control cards will look like either "A" or "B" as shown here. 



            A                             B 

    JOB,                          JOB, 
    ATTACH, A, NEGOPY.            ATTACH, A, NEGOPY. 
    A.                            ATTACH, DATA, yourdatafile. 
    7-8-9 (EOF)                   A. 
                                  7-8-9 (EOF) 


The cards in Set A are used when data are on punched cards. The cards in 
B are used if the data are on a permanent file called "yourdatafile." 

The first card is a job card. Here you specify your account number, 
time limits, memory length, and so on. In general, memory requirements 
are about 70K (octal) for small networks, 100 to 110K (octal) for medium 
ones, and 150K (octal) or more for very large ones. Experimentation is 
needed to determine the best numbers to use. 

The program usually runs in under three minutes. For very large runs, 
it is a good idea to allow five to ten minutes, however. The number of pages 
of output varies roughly as the number of nodes. A lower limit should be 
about 100 pages. Generally, an overestimate is safer than an underestimate, 
since a low estimate will necessitate re-running the whole program. 

The second card fetches the program object deck which is produced by 
compiling a source deck (FORTRAN) and cataloging the LGO file. 

The next card in B attaches the data file and gives it the local 
name "DATA". The "A." card causes the computer to begin execution of 
the program, and the 7-8-9 card (end of file) terminates the system 
control cards. 



Negopy Control Cards 

The Negopy input cards look like either A, B, C, or D below. 

       A                 B                 C                 D 

Label Card One    Label Card One    Label Card One    Label Card One 
Label Card Two    Label Card Two    Label Card Two    Label Card Two 
Parameter Card    Parameter Card    Parameter Card    Parameter Card 
Parameter Card    Parameter Card    Parameter Card    Parameter Card 
7-8-9             7-8-9             7-8-9             7-8-9 
Format Card for   Format Card for   Format Card for   Format Card for 
  Data              Data              Data              Data 
Data              7-8-9             Format Card for   Format Card for 
7-8-9                                 Namelist          Namelist 
                                    Namelist          Namelist 
                                    7-8-9             7-8-9 
                                    Data 
                                    7-8-9 


The cards shown in A are used when the data are on cards and there is no 
namelist. The cards in B are used if the data are on a file and there is 
no namelist. The cards in C are used if the data are on cards and there is 
a namelist. The cards in D are used if the data are on a file and there 
is a namelist. In the rest of this section, the following topics will 
be covered: label cards, parameter cards, data cards and data format 
cards, and namelist cards and namelist format cards. 

Label Cards 

There are always two label cards submitted with each run. The 
label cards may contain anything the user cares to put there. Whatever 
is put on the label cards will be printed from time to time on the print- 
out as identifying information. Therefore, it is useful to describe the 
dataset being used, the date of analysis, and any other identifying in- 
formation that may be helpful at some later time. 

Parameter Cards 

Parameter cards are used to change parameters to values other than 
default values. If no parameter cards are used, all parameters will re- 
main set at their default values. 

There may be up to six parameter cards used in any single run. 
Usually only one or two will be needed, however. After the last parameter 
card there is always a 7-8-9 card. 

Each parameter card may be divided into eight ten-column fields. 
One parameter can be set in each field, although any number less than 
eight may be set on any particular parameter card. 

Each ten-column field has this format: 

    Column:    1   2   3   4   5   6   7   8   9   10 
    Content:       P   A   A   =   Y   Y   Y   Y   Y 

The first column (Column 1, or Column 11, 21, 31, etc.) is always empty. 
The second column (Column 2, 12, 22, 32, 42, etc.) always says "P". The 
third and fourth columns (Column 3 and 4, 13 and 14, 23 and 24, etc.) have 
the number of the parameter being set. The fifth column (Column 5, 15, 
25, 35, etc.) always has an "=". The remaining five columns in the field have 
the value the parameter is to be set to. This value must be right- 
justified in the field. For example, if a parameter is being set to a 
value of one, the value field would either be four blanks followed by 
a "1" or four zeros and a "1". 

The first 35 parameters (P01 to P35) may be set to integer values (i.e., 
there may or may not be decimal points in the value fields for the 
first 35 parameters). The last ten parameters (P36 to P45) require 
decimal points. 

As many parameter cards as are needed (up to six) may be used. If 
eight parameters are to be set, they can all be done on one card. Alter- 
natively, they could be set on two cards, each of which has four para- 
meters. The parameters may be set in any order. No parameter may be 
set twice. An example of a set of parameter cards is shown below. 



[Figure: three example parameter cards (CARD 1, CARD 2, CARD 3) shown 
beneath a column ruler. The card images are not legible in this copy.] 

In this example, the following parameter settings were made: 

P01 = 200       P08 = [illegible]    P07 = 60     P34 = 1 
P02 = 1000      P11 = [illegible]    P09 = 2 
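The ten-column field layout described above can be generated mechanically. The following Python sketch is only an illustration of that layout (the program itself reads punched cards); the function names, the 80-column padding, and the choice of two decimal places for P36 to P45 are assumptions made for the example.

```python
def parameter_field(number, value):
    """Build one ten-column field: blank, "P", two-digit number, "=", value."""
    if 1 <= number <= 35:
        text = str(int(value))             # integer-valued parameters
    elif 36 <= number <= 45:
        text = f"{float(value):.2f}"       # these require a decimal point
    else:
        raise ValueError("parameter numbers run from 1 to 45")
    if len(text) > 5:
        raise ValueError("value does not fit in the five-column value field")
    return " P%02d=" % number + text.rjust(5)   # value is right-justified


def parameter_cards(settings):
    """Pack {number: value} settings eight to a card, at most six cards."""
    fields = [parameter_field(n, v) for n, v in settings.items()]
    if len(fields) > 48:
        raise ValueError("at most six parameter cards may be used")
    return ["".join(fields[k:k + 8]).ljust(80)
            for k in range(0, len(fields), 8)]
```

Because the settings are passed as a dictionary, no parameter can be set twice. Calling `parameter_cards({1: 200, 2: 1000, 9: 2})` yields a single card beginning ` P01=  200 P02= 1000 P09=    2`.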



Data Cards and Data Format Cards 

The format of data cards was discussed earlier. A quick 
review here would be helpful. There can be up to ten links on a data 
card. The respondent's number must come before any links. Each link 
requires either two or three numbers to describe it. The first is always 
a contactee number. The second is an Xweight, and the third, if it is 
present, is a Yweight. Both the respondent number and the contactee 
numbers must be in integer (I) format, while Xweights and Yweights must be in 
floating point (F) format. There may be as many blanks as is desired 
between the various fields. There may be other information before or 
after any of the numbers on the card, as long as the order of: 

    Respondent # [Contactee #, Xweight (Yweight)] [Contactee #, Xweight (Yweight)] 

is followed across the card. 

For example, Columns 1-5 might have some identifying information 
like project number. Columns 6-10 might be respondent #. Links could 
each require ten columns, so that contactee numbers would be in Columns 
11-15, 21-25, 31-35, 41-45, 51-55, and 61-65. Xweights might appear in 
Columns 17-18, 27-28, 37-38, 47-48, 57-58, and 67-68. The other columns 
are left blank. The format for this coding plan would be: 

    (5X, I5, 6(I5, X, F2.0, 2X)) 

The "5X" tells the computer to skip five columns. 

The first "I5" tells it that a five-column integer is next 
(the respondent number). 

The "6(" tells the computer that the part inside the parentheses 
is repeated six times (once for each link). 

The next "I5" tells the computer that a five-column integer 
is next (a contactee number). 

The "X" means "skip a column." 

The "F2.0" tells the computer that a two-column floating 
point number, with no places to the right of the decimal, 
is next (this is an Xweight). 

The "2X" means "skip two columns." 

The first ")" means "this is the end of the part that is 
repeated", and the second ")" means "this is the end of 
the card." 

The format is punched on a card, starting in Column One, leaving no 
spaces between characters. No other information may appear on this 
card. 
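To make the column arithmetic concrete, the coding plan in the example above, (5X, I5, 6(I5, X, F2.0, 2X)), can be read with the Python sketch below. It is an illustration of that one layout only, not a general FORMAT interpreter; the function name is hypothetical, and treating a blank contactee field as "no link" is an assumption.

```python
def parse_data_card(card, links_per_card=6):
    """Read one data card laid out as (5X, I5, 6(I5, X, F2.0, 2X))."""
    card = card.ljust(80)
    respondent = int(card[5:10])             # Columns 6-10
    links = []
    for k in range(links_per_card):
        start = 10 + 10 * k                  # each link occupies ten columns
        contactee = card[start:start + 5].strip()       # e.g. Columns 11-15
        if not contactee:
            continue                         # assumption: blank field = no link
        xweight = card[start + 6:start + 8].strip()     # e.g. Columns 17-18
        links.append((int(contactee), float(xweight) if xweight else 0.0))
    return respondent, links
```

A card with a project code in Columns 1-5, respondent 12 in Columns 6-10, and links to nodes 7 and 9 with Xweights 3 and 10 is returned as `(12, [(7, 3.0), (9, 10.0)])`.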

Namelist Cards and Namelist Format Cards 

If there is a namelist (i.e., P09 is set to "1" or "2"), it will 
be structured as a list of elements. Each element must have first the 
node number and second that node's "name." The name may be up to 
twenty columns wide. Elements may be punched one per card or several 
per card. The namelist format card will tell the computer how many 
"names" there are per card. 

Example. One "name" per card, with P09 = 1. 

    (3X, I5, 2X, A10)  The computer will get node numbers from 
    Columns 4-8, skip Columns 9 and 10, and get the "names" 
    from Columns 11-20. 

Example. Three "names" per card, with P09 = 2. 

    (3(I5, 2A10))  The "3" tells the computer that there are three 
    elements per card. The first five columns of each element 
    are the node number, and the next twenty columns are the 
    "name." 

The namelist is terminated by a 7-8-9 (eof) card. If some nodes do 
not have "names," the computer will assign blank names to them. The 
names do not have to be in any particular order within the namelist. 

II. MISCELLANEOUS ASPECTS OF RUNNING THE PROGRAM 

A. Error Messages or Warnings 

The program prints several kinds of error messages and warnings 
when certain situations are encountered. These will be covered in the 
order they may appear. 

1. "DANGER. YOU ASKED FOR XXX LINKS. YOU ONLY HAD YYY." This 
message appears after the raw data have been printed. It usually means 
that an error was made in either the deck set-up (the cards were not in 
the correct order) or the input format card for the data. 

2. "DANGER. YOU HAD MORE LINKS THAN YOU SAID. THE REST WILL BE 
IGNORED." The meaning of this is self-explanatory. To correct the 
situation, set P02 to a higher value. 

3. "ALL MEANS LESS THAN OR EQUAL TO ZERO. PROGRAM WILL STOP HERE." 
This indicates either that the strengths of all links are very low or that 
there are not enough participants to go on with the analysis. Review 
the data format to see that the links are being properly read. 

4. "THERE ARE NO GROUPS FOR THIS RUN." For one reason or another, 
there are no groups. This could be due to the data -- all nodes are 
either non-participants or else there is no organization into groups, in 
which case all nodes will be classified as type "other". The problem 
could also be due to the way the parameters were set. For example, if 
the histogram is thin and spotty, the value of P14 should be raised. If 
the histogram looks like the one shown in "B" on page 62 but there are no 
horizontal lines drawn across the page, the value of P23 should be raised. 
Another possibility is that more iterations are needed. P05 should be 
set to a higher value -- perhaps eight -- if it was six or lower in the 
run that gave no groups. If P05 was set to a higher value already, it 
should be set to a lower value -- perhaps to four. 
4^ ' ' 

5. "DANGER. GROUP IS TOO LARGE. GROUP HAS OVER 100 MEMBERS." The 
program cannot apply the formal criteria to groups having over 100 members. 
Such large groups may be evidence of systems with a very low degree of 
organization. (When random data are submitted to Negopy, monolithic 
groups result.) If it is felt that this is not the case, P14 can be 
lowered and P23 raised. If this does not succeed in giving more groups, 
the links can be "thinned out," perhaps by dropping unreciprocated links 
or raising the value of P10. If this is not acceptable, the value of 
P44 may be raised and the run re-submitted. 

6. "YOU CERTAINLY HAVE A LOT OF ISOLATES. ARE YOU SURE YOU'RE DOING 
THIS RIGHT?" This message is printed whenever the number of isolates 
exceeds a certain percentage of the total. It is just a warning message 
that indicates that the parameters should be carefully checked for 
accuracy. 

7. "XX PERCENT RECIPROCATION--THAT'S VERY LOW." This is a 
warning message similar to the one in 6 above. 



B. Adjusting the Parameters to get Better Results 

1. If the groups produced are large and loosely linked, with a lot 
of 5's, 6's, 7's, or even higher numbers in the distance matrix, P43 
could be raised and P24 lowered. This will cause the computer to try 
harder in the splitting of groups. 

2. If a large proportion of the nodes are "type other," several 
things can be done. If P03 is presently set to "1", it could be switched 
to "0", causing unreciprocated links to be dropped. P10 could be raised 
to a higher value, causing weak links to be ignored. P44 could be raised 
to a higher value, weighting links that look like within-group links more 
in the vector averaging process. P05 could be either raised or lowered. 

It may not be possible to eliminate all the "others", if they 
really do not fit into a group structure configuration. They just might 
not be organized enough to be differentiated into groups. 

3. If the groups are very small, it may be that the computer is 
splitting them too much. If it is felt that this is the case, the value 
of P43 should be lowered and the value of P24 raised. If the groups are 
too small even without splitting by the computer, the value of P23 can be 
lowered, the value of P14 raised, and the value of P05 raised. If P03 
is presently set to "0", it could be changed to "1". 

C. Known Bugs 

There are still a few minor bugs (errors) in the program. We have 
tried to locate and fix all of these, but some are especially resistant 
to fixing. The ones we are aware of are: 

1. P12 sometimes prints an error message which says that the value 
supplied by the user is invalid, even when the user does not supply a 
value. The program sets P12 to 1 in these cases. 

(Note -- this bug has been fixed since the manual was written.) 

2. Sometimes the computer prints an error message or warning when 
it is splitting a group into two parts, even when the message does not 
apply. Ignore the message. 

3. When a group exceeds the limit of 100 members, all further 
information pertaining to that group may be unreliable. 

4. The integrativeness scores for tree nodes may be negative numbers. 
Ignore these values. Tree nodes, like all non-participants, have inte- 
grativeness scores of zero. 

5. Integrativeness scores of members of groups having over 100 
members may have [illegible] in them. Ignore these numbers, as they are 
unreliable (see Bug #3). 

6. These are all the bugs that are presently known. If you think 
you have found more, please contact: 



WILLIAM D. RICHARDS 

Institute for Communication Research 

Stanford, California 94305 



PHONE: (415) 497-2755 


PARAMETER LIST 

In this section the control parameters are listed in five parts, 
which coincide with the five parts of the analysis. In the leftmost column 
is the parameter number, which is followed by the parameter name and a 
brief description of the function of the parameter. The numbers in the 
column labelled "page" indicate which page that parameter is discussed on 
in the manual. The next column, headed "default," indicates the default 
value of the parameter -- the value that will be supplied by the computer 
if the user does not set a value. In the last column is the maximum (or 
minimum) value the parameter can take. 



I. DATA DESCRIPTION 

                                                       page  default  maximum 

P01  # OF NODES    highest legal subject number          48             4095 
P02  # OF LINKS    estimate of the number of links       48             32,767 
P03  RECIPROCAT    0 -- drop unreciprocated links        49 
                   1 -- add links to force reciproca- 
                        tion 
P04  DIRECTION     0 -- assume links are non-directed    49 
                   1 -- assume links are directed 
P07  DATA UNIT     40 -- data are on file with local     50 
                         name "DATA" 
                   60 -- data are on cards 
P31  #WEIGHTS      1 -- only X-weights are used          50 
                   2 -- both X-weights and Y-weights 
P08  #OBSV/CARD    maximum number of links per card      51             10 
P09  NAME-WIDTH    0 -- no "names" will be used          52             2 
                   1 -- "names" up to ten columns 
                   2 -- "names" up to twenty columns 
P10  LOW WEIGHT    lowest legal strength for links       52     1       255 
                   (after link weighting formula) 
P11  HI WEIGHT     highest legal strength for links      52 
                   (after link weighting formula) 
P15  EXPONENT      these are all values in the 
P37  CONS          link weighting formula 
P38  [name illegible] 
P39  MY 
P40  [name illegible] 
P41  CCY 
P42  MCPK 
P06  # RAW PRINT   computer prints the first P06                10 
                   good links 
P34  MEAN STRST    0 -- computer uses strength values    55     0 
                        as reported 
                   1 -- computer sets both incoming and 
                        outgoing strength values for 
                        reciprocated links to their mean 

(Entries printed in the default and maximum columns of this part whose 
rows cannot be determined from this copy: 40, 255.) 


II. PREPARATION FOR GROUP DETECTION 

                                                       page  default  maximum 

P44  2-STEP WT     influence of within group weighting   60    1.00 
                   factor (see also page 29) 
P05  # OF ITERS    number of iterations of vector        60    4 
                   averaging process 




III. INITIAL GROUP DETECTION 

                                                       page  default  maximum 

P14  SCAN RADIUS   radius of scanning window 
P13  DNSTY HIST    0 -- no density histogram printed     62 
                   1 -- density histogram will be 
                        printed 
P23  GRP-SNSVTY    controls sensitivity of group detec-  62 
                   tion routine 

(Entries printed in the default and maximum columns of this part, in the 
order they appear in this copy: .1, .3, 200, 1, 100.) 


IV. FORMAL GROUP CRITERIA 

                                                       page  default  maximum 

P36  PERW          this is the percentage for groups     65    50.05 
P22  MIN SPLIT     smallest group computer will try to   65 
                   split 
P24  SPLIT DEV     computer will try to split groups 
                   with SD of row means greater than 
                   P24 
P43  DROP-SPLIT    if greater than one, computer will 
                   remove up to this many nodes in 
                   attempts to split groups. 
                   if less than one, computer will 
                   remove up to this proportion of 
                   group in attempts to split. 

(Entries printed in the default and maximum columns of this part, in the 
order they appear in this copy: 12, 30, 0.10, min.) 


V. FINAL RESULTS AND CONTROL OF OUTPUT 

                                                             default 

P17  FILE OUTPUT   0 -- no file output will be made             0 
                   1 -- file output will be made 
P18  PRINTO SUP    0 -- print link list                         0 
                   1 -- suppress link list 
P19  GRID SUP      0 -- print intermediate group lists          1 
                   1 -- suppress intermediate lists 
P20  GROUP SUP     0 -- print final group tables                0 
                   1 -- suppress final group tables 
P21  MAX OUTPUT    used for debugging the program               15 
P32  ISOSUP        0 -- include isolates in link list           1 
                   1 -- suppress isolates in link list 
P33  DETAILS       0 -- do not print additional details         0 
                        of group formation process 
                   1 -- print additional details 
P35  PUNCH DECK    0 -- do not punch summary deck               0 
                   1 -- punch a summary deck 

(Page entries printed for this part, in the order they appear in this 
copy: 69, 69, 69, 69, 69, 70, 70.) 